* [Qemu-devel] [PATCH v2 0/5] qcow2: Implement .bdrv_co_preadv/pwritev
@ 2016-06-06 14:59 Kevin Wolf
2016-06-06 14:59 ` [Qemu-devel] [PATCH v2 1/5] qcow2: Work with bytes in qcow2_get_cluster_offset() Kevin Wolf
` (5 more replies)
0 siblings, 6 replies; 15+ messages in thread
From: Kevin Wolf @ 2016-06-06 14:59 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, mreitz, jsnow, eblake, qemu-devel
This series converts qcow2 to the byte-based I/O interfaces. This simplifies
the code by removing many unit conversions, and in the unlikely case of actual
unaligned requests, it even makes the driver work more efficiently by avoiding
read-modify-write.
v2:
- Be more careful with integer overflows [Eric]
- Updated some more comments [Eric]
- Rename copy_sectors -> do_perform_cow [Eric]
- Write single byte instead of full sector at preallocation end [Eric]
Kevin Wolf (5):
qcow2: Work with bytes in qcow2_get_cluster_offset()
qcow2: Implement .bdrv_co_preadv()
qcow2: Make copy_sectors() byte based
qcow2: Use bytes instead of sectors for QCowL2Meta
qcow2: Implement .bdrv_co_pwritev()
block/qcow2-cluster.c | 146 +++++++++++++++++--------------------
block/qcow2.c | 197 +++++++++++++++++++++++++-------------------------
block/qcow2.h | 18 ++---
trace-events | 6 +-
4 files changed, 176 insertions(+), 191 deletions(-)
--
1.8.3.1
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Qemu-devel] [PATCH v2 1/5] qcow2: Work with bytes in qcow2_get_cluster_offset()
2016-06-06 14:59 [Qemu-devel] [PATCH v2 0/5] qcow2: Implement .bdrv_co_preadv/pwritev Kevin Wolf
@ 2016-06-06 14:59 ` Kevin Wolf
2016-06-06 17:11 ` Eric Blake
2016-06-06 14:59 ` [Qemu-devel] [PATCH v2 2/5] qcow2: Implement .bdrv_co_preadv() Kevin Wolf
` (4 subsequent siblings)
5 siblings, 1 reply; 15+ messages in thread
From: Kevin Wolf @ 2016-06-06 14:59 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, mreitz, jsnow, eblake, qemu-devel
This patch changes the units that qcow2_get_cluster_offset() uses
internally, without touching the interface just yet. This will be done
in another patch.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block/qcow2-cluster.c | 44 +++++++++++++++++++++++---------------------
1 file changed, 23 insertions(+), 21 deletions(-)
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index d901d89..c5906e6 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -482,29 +482,28 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
unsigned int l2_index;
uint64_t l1_index, l2_offset, *l2_table;
int l1_bits, c;
- unsigned int index_in_cluster, nb_clusters;
- uint64_t nb_available, nb_needed;
+ unsigned int offset_in_cluster, nb_clusters;
+ uint64_t bytes_available, bytes_needed;
int ret;
+ unsigned int bytes;
- index_in_cluster = (offset >> 9) & (s->cluster_sectors - 1);
- nb_needed = *num + index_in_cluster;
+ assert(*num <= BDRV_REQUEST_MAX_SECTORS);
+ bytes = *num * BDRV_SECTOR_SIZE;
- l1_bits = s->l2_bits + s->cluster_bits;
-
- /* compute how many bytes there are between the offset and
- * the end of the l1 entry
- */
+ offset_in_cluster = offset_into_cluster(s, offset);
+ bytes_needed = (uint64_t) bytes + offset_in_cluster;
- nb_available = (1ULL << l1_bits) - (offset & ((1ULL << l1_bits) - 1));
-
- /* compute the number of available sectors */
+ l1_bits = s->l2_bits + s->cluster_bits;
- nb_available = (nb_available >> 9) + index_in_cluster;
+ /* compute how many bytes there are between the start of the cluster
+ * containing offset and the end of the l1 entry */
+ bytes_available = (1ULL << l1_bits) - (offset & ((1ULL << l1_bits) - 1))
+ + offset_in_cluster;
- if (nb_needed > nb_available) {
- nb_needed = nb_available;
+ if (bytes_needed > bytes_available) {
+ bytes_needed = bytes_available;
}
- assert(nb_needed <= INT_MAX);
+ assert(bytes_needed <= INT_MAX);
*cluster_offset = 0;
@@ -542,7 +541,7 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
*cluster_offset = be64_to_cpu(l2_table[l2_index]);
/* nb_needed <= INT_MAX, thus nb_clusters <= INT_MAX, too */
- nb_clusters = size_to_clusters(s, nb_needed << 9);
+ nb_clusters = size_to_clusters(s, bytes_needed);
ret = qcow2_get_cluster_type(*cluster_offset);
switch (ret) {
@@ -589,13 +588,16 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
- nb_available = (c * s->cluster_sectors);
+ bytes_available = (c * s->cluster_size);
out:
- if (nb_available > nb_needed)
- nb_available = nb_needed;
+ if (bytes_available > bytes_needed) {
+ bytes_available = bytes_needed;
+ }
- *num = nb_available - index_in_cluster;
+ bytes = bytes_available - offset_in_cluster;
+ assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
+ *num = bytes >> BDRV_SECTOR_BITS;
return ret;
--
1.8.3.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [Qemu-devel] [PATCH v2 2/5] qcow2: Implement .bdrv_co_preadv()
2016-06-06 14:59 [Qemu-devel] [PATCH v2 0/5] qcow2: Implement .bdrv_co_preadv/pwritev Kevin Wolf
2016-06-06 14:59 ` [Qemu-devel] [PATCH v2 1/5] qcow2: Work with bytes in qcow2_get_cluster_offset() Kevin Wolf
@ 2016-06-06 14:59 ` Kevin Wolf
2016-06-06 21:32 ` Eric Blake
2016-06-06 14:59 ` [Qemu-devel] [PATCH v2 3/5] qcow2: Make copy_sectors() byte based Kevin Wolf
` (3 subsequent siblings)
5 siblings, 1 reply; 15+ messages in thread
From: Kevin Wolf @ 2016-06-06 14:59 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, mreitz, jsnow, eblake, qemu-devel
Reading from qcow2 images is now byte granularity.
Most of the affected code in qcow2 actually gets simpler with this
change. The only exception is encryption, which is fixed on 512 bytes
blocks; in order to keep this working, bs->request_alignment is set for
encrypted images.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block/qcow2-cluster.c | 27 ++++++-------
block/qcow2.c | 108 +++++++++++++++++++++++++++-----------------------
block/qcow2.h | 2 +-
3 files changed, 72 insertions(+), 65 deletions(-)
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index c5906e6..48a8e13 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -424,7 +424,8 @@ static int coroutine_fn copy_sectors(BlockDriverState *bs,
* interface. This avoids double I/O throttling and request tracking,
* which can lead to deadlock when block layer copy-on-read is enabled.
*/
- ret = bs->drv->bdrv_co_readv(bs, start_sect + n_start, n, &qiov);
+ ret = bs->drv->bdrv_co_preadv(bs, (start_sect + n_start) * BDRV_SECTOR_SIZE,
+ n * BDRV_SECTOR_SIZE, &qiov, 0);
if (ret < 0) {
goto out;
}
@@ -464,19 +465,21 @@ out:
/*
* get_cluster_offset
*
- * For a given offset of the disk image, find the cluster offset in
- * qcow2 file. The offset is stored in *cluster_offset.
+ * For a given offset of the virtual disk, find the cluster type and offset in
+ * the qcow2 file. The offset is stored in *cluster_offset.
*
- * on entry, *num is the number of contiguous sectors we'd like to
- * access following offset.
+ * On entry, *bytes is the maximum number of contiguous bytes starting at
+ * offset that we are interested in.
*
- * on exit, *num is the number of contiguous sectors we can read.
+ * On exit, *bytes is the number of bytes staring at offset that have the same
+ * cluster type and (if applicable) are stored contiguously in the image file.
+ * Compressed clusters are always returned one by one.
*
* Returns the cluster type (QCOW2_CLUSTER_*) on success, -errno in error
* cases.
*/
int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
- int *num, uint64_t *cluster_offset)
+ unsigned int *bytes, uint64_t *cluster_offset)
{
BDRVQcow2State *s = bs->opaque;
unsigned int l2_index;
@@ -485,13 +488,9 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
unsigned int offset_in_cluster, nb_clusters;
uint64_t bytes_available, bytes_needed;
int ret;
- unsigned int bytes;
-
- assert(*num <= BDRV_REQUEST_MAX_SECTORS);
- bytes = *num * BDRV_SECTOR_SIZE;
offset_in_cluster = offset_into_cluster(s, offset);
- bytes_needed = (uint64_t) bytes + offset_in_cluster;
+ bytes_needed = (uint64_t) *bytes + offset_in_cluster;
l1_bits = s->l2_bits + s->cluster_bits;
@@ -595,9 +594,7 @@ out:
bytes_available = bytes_needed;
}
- bytes = bytes_available - offset_in_cluster;
- assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
- *num = bytes >> BDRV_SECTOR_BITS;
+ *bytes = bytes_available - offset_in_cluster;
return ret;
diff --git a/block/qcow2.c b/block/qcow2.c
index 6f5fb81..545dea0 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -975,6 +975,9 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
}
bs->encrypted = 1;
+
+ /* Encryption works on a sector granularity */
+ bs->request_alignment = BDRV_SECTOR_SIZE;
}
s->l2_bits = s->cluster_bits - 3; /* L2 is always one cluster */
@@ -1331,16 +1334,20 @@ static int64_t coroutine_fn qcow2_co_get_block_status(BlockDriverState *bs,
BDRVQcow2State *s = bs->opaque;
uint64_t cluster_offset;
int index_in_cluster, ret;
+ unsigned int bytes;
int64_t status = 0;
- *pnum = nb_sectors;
+ bytes = MIN(INT_MAX, nb_sectors * BDRV_SECTOR_SIZE);
qemu_co_mutex_lock(&s->lock);
- ret = qcow2_get_cluster_offset(bs, sector_num << 9, pnum, &cluster_offset);
+ ret = qcow2_get_cluster_offset(bs, sector_num << 9, &bytes,
+ &cluster_offset);
qemu_co_mutex_unlock(&s->lock);
if (ret < 0) {
return ret;
}
+ *pnum = bytes >> BDRV_SECTOR_BITS;
+
if (cluster_offset != 0 && ret != QCOW2_CLUSTER_COMPRESSED &&
!s->cipher) {
index_in_cluster = sector_num & (s->cluster_sectors - 1);
@@ -1358,28 +1365,34 @@ static int64_t coroutine_fn qcow2_co_get_block_status(BlockDriverState *bs,
/* handle reading after the end of the backing file */
int qcow2_backing_read1(BlockDriverState *bs, QEMUIOVector *qiov,
- int64_t sector_num, int nb_sectors)
+ int64_t offset, int bytes)
{
+ uint64_t bs_size = bs->total_sectors * BDRV_SECTOR_SIZE;
int n1;
- if ((sector_num + nb_sectors) <= bs->total_sectors)
- return nb_sectors;
- if (sector_num >= bs->total_sectors)
+
+ if ((offset + bytes) <= bs_size) {
+ return bytes;
+ }
+
+ if (offset >= bs_size) {
n1 = 0;
- else
- n1 = bs->total_sectors - sector_num;
+ } else {
+ n1 = bs_size - offset;
+ }
- qemu_iovec_memset(qiov, 512 * n1, 0, 512 * (nb_sectors - n1));
+ qemu_iovec_memset(qiov, n1, 0, bytes - n1);
return n1;
}
-static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
- int remaining_sectors, QEMUIOVector *qiov)
+static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
+ uint64_t bytes, QEMUIOVector *qiov,
+ int flags)
{
BDRVQcow2State *s = bs->opaque;
- int index_in_cluster, n1;
+ int offset_in_cluster, n1;
int ret;
- int cur_nr_sectors; /* number of sectors in current iteration */
+ unsigned int cur_bytes; /* number of bytes in current iteration */
uint64_t cluster_offset = 0;
uint64_t bytes_done = 0;
QEMUIOVector hd_qiov;
@@ -1389,26 +1402,24 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
qemu_co_mutex_lock(&s->lock);
- while (remaining_sectors != 0) {
+ while (bytes != 0) {
/* prepare next request */
- cur_nr_sectors = remaining_sectors;
+ cur_bytes = MIN(bytes, INT_MAX);
if (s->cipher) {
- cur_nr_sectors = MIN(cur_nr_sectors,
- QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors);
+ cur_bytes = MIN(cur_bytes,
+ QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size);
}
- ret = qcow2_get_cluster_offset(bs, sector_num << 9,
- &cur_nr_sectors, &cluster_offset);
+ ret = qcow2_get_cluster_offset(bs, offset, &cur_bytes, &cluster_offset);
if (ret < 0) {
goto fail;
}
- index_in_cluster = sector_num & (s->cluster_sectors - 1);
+ offset_in_cluster = offset_into_cluster(s, offset);
qemu_iovec_reset(&hd_qiov);
- qemu_iovec_concat(&hd_qiov, qiov, bytes_done,
- cur_nr_sectors * 512);
+ qemu_iovec_concat(&hd_qiov, qiov, bytes_done, cur_bytes);
switch (ret) {
case QCOW2_CLUSTER_UNALLOCATED:
@@ -1416,18 +1427,17 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
if (bs->backing) {
/* read from the base image */
n1 = qcow2_backing_read1(bs->backing->bs, &hd_qiov,
- sector_num, cur_nr_sectors);
+ offset, cur_bytes);
if (n1 > 0) {
QEMUIOVector local_qiov;
qemu_iovec_init(&local_qiov, hd_qiov.niov);
- qemu_iovec_concat(&local_qiov, &hd_qiov, 0,
- n1 * BDRV_SECTOR_SIZE);
+ qemu_iovec_concat(&local_qiov, &hd_qiov, 0, n1);
BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
qemu_co_mutex_unlock(&s->lock);
- ret = bdrv_co_readv(bs->backing->bs, sector_num,
- n1, &local_qiov);
+ ret = bdrv_co_preadv(bs->backing->bs, offset, n1,
+ &local_qiov, 0);
qemu_co_mutex_lock(&s->lock);
qemu_iovec_destroy(&local_qiov);
@@ -1438,12 +1448,12 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
}
} else {
/* Note: in this case, no need to wait */
- qemu_iovec_memset(&hd_qiov, 0, 0, 512 * cur_nr_sectors);
+ qemu_iovec_memset(&hd_qiov, 0, 0, cur_bytes);
}
break;
case QCOW2_CLUSTER_ZERO:
- qemu_iovec_memset(&hd_qiov, 0, 0, 512 * cur_nr_sectors);
+ qemu_iovec_memset(&hd_qiov, 0, 0, cur_bytes);
break;
case QCOW2_CLUSTER_COMPRESSED:
@@ -1454,8 +1464,8 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
}
qemu_iovec_from_buf(&hd_qiov, 0,
- s->cluster_cache + index_in_cluster * 512,
- 512 * cur_nr_sectors);
+ s->cluster_cache + offset_in_cluster,
+ cur_bytes);
break;
case QCOW2_CLUSTER_NORMAL:
@@ -1482,34 +1492,34 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
}
}
- assert(cur_nr_sectors <=
- QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors);
+ assert(cur_bytes <= QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size);
qemu_iovec_reset(&hd_qiov);
- qemu_iovec_add(&hd_qiov, cluster_data,
- 512 * cur_nr_sectors);
+ qemu_iovec_add(&hd_qiov, cluster_data, cur_bytes);
}
BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
qemu_co_mutex_unlock(&s->lock);
- ret = bdrv_co_readv(bs->file->bs,
- (cluster_offset >> 9) + index_in_cluster,
- cur_nr_sectors, &hd_qiov);
+ ret = bdrv_co_preadv(bs->file->bs,
+ cluster_offset + offset_in_cluster,
+ cur_bytes, &hd_qiov, 0);
qemu_co_mutex_lock(&s->lock);
if (ret < 0) {
goto fail;
}
if (bs->encrypted) {
assert(s->cipher);
+ assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
+ assert((cur_bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
Error *err = NULL;
- if (qcow2_encrypt_sectors(s, sector_num, cluster_data,
- cluster_data, cur_nr_sectors, false,
- &err) < 0) {
+ if (qcow2_encrypt_sectors(s, offset >> BDRV_SECTOR_BITS,
+ cluster_data, cluster_data,
+ cur_bytes >> BDRV_SECTOR_BITS,
+ false, &err) < 0) {
error_free(err);
ret = -EIO;
goto fail;
}
- qemu_iovec_from_buf(qiov, bytes_done,
- cluster_data, 512 * cur_nr_sectors);
+ qemu_iovec_from_buf(qiov, bytes_done, cluster_data, cur_bytes);
}
break;
@@ -1519,9 +1529,9 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
goto fail;
}
- remaining_sectors -= cur_nr_sectors;
- sector_num += cur_nr_sectors;
- bytes_done += cur_nr_sectors * 512;
+ bytes -= cur_bytes;
+ offset += cur_bytes;
+ bytes_done += cur_bytes;
}
ret = 0;
@@ -2435,7 +2445,7 @@ static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
if (head || tail) {
int64_t cl_start = (offset - head) >> BDRV_SECTOR_BITS;
uint64_t off;
- int nr;
+ unsigned int nr;
assert(head + count <= s->cluster_size);
@@ -2452,7 +2462,7 @@ static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
/* We can have new write after previous check */
offset = cl_start << BDRV_SECTOR_BITS;
count = s->cluster_size;
- nr = s->cluster_sectors;
+ nr = s->cluster_size;
ret = qcow2_get_cluster_offset(bs, offset, &nr, &off);
if (ret != QCOW2_CLUSTER_UNALLOCATED && ret != QCOW2_CLUSTER_ZERO) {
qemu_co_mutex_unlock(&s->lock);
@@ -3368,7 +3378,7 @@ BlockDriver bdrv_qcow2 = {
.bdrv_co_get_block_status = qcow2_co_get_block_status,
.bdrv_set_key = qcow2_set_key,
- .bdrv_co_readv = qcow2_co_readv,
+ .bdrv_co_preadv = qcow2_co_preadv,
.bdrv_co_writev = qcow2_co_writev,
.bdrv_co_flush_to_os = qcow2_co_flush_to_os,
diff --git a/block/qcow2.h b/block/qcow2.h
index 7db9795..e2c42d5 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -544,7 +544,7 @@ int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
int nb_sectors, bool enc, Error **errp);
int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
- int *num, uint64_t *cluster_offset);
+ unsigned int *bytes, uint64_t *cluster_offset);
int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
int *num, uint64_t *host_offset, QCowL2Meta **m);
uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
--
1.8.3.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [Qemu-devel] [PATCH v2 3/5] qcow2: Make copy_sectors() byte based
2016-06-06 14:59 [Qemu-devel] [PATCH v2 0/5] qcow2: Implement .bdrv_co_preadv/pwritev Kevin Wolf
2016-06-06 14:59 ` [Qemu-devel] [PATCH v2 1/5] qcow2: Work with bytes in qcow2_get_cluster_offset() Kevin Wolf
2016-06-06 14:59 ` [Qemu-devel] [PATCH v2 2/5] qcow2: Implement .bdrv_co_preadv() Kevin Wolf
@ 2016-06-06 14:59 ` Kevin Wolf
2016-06-07 3:47 ` Eric Blake
2016-06-06 14:59 ` [Qemu-devel] [PATCH v2 4/5] qcow2: Use bytes instead of sectors for QCowL2Meta Kevin Wolf
` (2 subsequent siblings)
5 siblings, 1 reply; 15+ messages in thread
From: Kevin Wolf @ 2016-06-06 14:59 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, mreitz, jsnow, eblake, qemu-devel
This will allow copy on write operations where the overwritten part of
the cluster is not aligned to sector boundaries.
Also rename the function because it has nothing to do with sectors any
more.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block/qcow2-cluster.c | 54 +++++++++++++++++++++++++--------------------------
1 file changed, 26 insertions(+), 28 deletions(-)
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 48a8e13..e84d6db 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -390,22 +390,18 @@ int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
return 0;
}
-static int coroutine_fn copy_sectors(BlockDriverState *bs,
- uint64_t start_sect,
- uint64_t cluster_offset,
- int n_start, int n_end)
+static int coroutine_fn do_perform_cow(BlockDriverState *bs,
+ uint64_t src_cluster_offset,
+ uint64_t cluster_offset,
+ int offset_in_cluster,
+ int bytes)
{
BDRVQcow2State *s = bs->opaque;
QEMUIOVector qiov;
struct iovec iov;
- int n, ret;
-
- n = n_end - n_start;
- if (n <= 0) {
- return 0;
- }
+ int ret;
- iov.iov_len = n * BDRV_SECTOR_SIZE;
+ iov.iov_len = bytes;
iov.iov_base = qemu_try_blockalign(bs, iov.iov_len);
if (iov.iov_base == NULL) {
return -ENOMEM;
@@ -424,18 +420,20 @@ static int coroutine_fn copy_sectors(BlockDriverState *bs,
* interface. This avoids double I/O throttling and request tracking,
* which can lead to deadlock when block layer copy-on-read is enabled.
*/
- ret = bs->drv->bdrv_co_preadv(bs, (start_sect + n_start) * BDRV_SECTOR_SIZE,
- n * BDRV_SECTOR_SIZE, &qiov, 0);
+ ret = bs->drv->bdrv_co_preadv(bs, src_cluster_offset + offset_in_cluster,
+ bytes, &qiov, 0);
if (ret < 0) {
goto out;
}
if (bs->encrypted) {
Error *err = NULL;
+ int sector = (cluster_offset + offset_in_cluster) >> BDRV_SECTOR_BITS;
assert(s->cipher);
- if (qcow2_encrypt_sectors(s, start_sect + n_start,
- iov.iov_base, iov.iov_base, n,
- true, &err) < 0) {
+ assert((offset_in_cluster & BDRV_SECTOR_MASK) == 0);
+ assert((bytes & ~BDRV_SECTOR_MASK) == 0);
+ if (qcow2_encrypt_sectors(s, sector, iov.iov_base, iov.iov_base,
+ bytes >> BDRV_SECTOR_BITS, true, &err) < 0) {
ret = -EIO;
error_free(err);
goto out;
@@ -443,14 +441,14 @@ static int coroutine_fn copy_sectors(BlockDriverState *bs,
}
ret = qcow2_pre_write_overlap_check(bs, 0,
- cluster_offset + n_start * BDRV_SECTOR_SIZE, n * BDRV_SECTOR_SIZE);
+ cluster_offset + offset_in_cluster, bytes);
if (ret < 0) {
goto out;
}
BLKDBG_EVENT(bs->file, BLKDBG_COW_WRITE);
- ret = bdrv_co_writev(bs->file->bs, (cluster_offset >> 9) + n_start, n,
- &qiov);
+ ret = bdrv_co_pwritev(bs->file->bs, cluster_offset + offset_in_cluster,
+ bytes, &qiov, 0);
if (ret < 0) {
goto out;
}
@@ -745,9 +743,8 @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m, Qcow2COWRegion *r)
}
qemu_co_mutex_unlock(&s->lock);
- ret = copy_sectors(bs, m->offset / BDRV_SECTOR_SIZE, m->alloc_offset,
- r->offset / BDRV_SECTOR_SIZE,
- r->offset / BDRV_SECTOR_SIZE + r->nb_sectors);
+ ret = do_perform_cow(bs, m->offset, m->alloc_offset,
+ r->offset, r->nb_sectors * BDRV_SECTOR_SIZE);
qemu_co_mutex_lock(&s->lock);
if (ret < 0) {
@@ -809,13 +806,14 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
assert(l2_index + m->nb_clusters <= s->l2_size);
for (i = 0; i < m->nb_clusters; i++) {
/* if two concurrent writes happen to the same unallocated cluster
- * each write allocates separate cluster and writes data concurrently.
- * The first one to complete updates l2 table with pointer to its
- * cluster the second one has to do RMW (which is done above by
- * copy_sectors()), update l2 table with its cluster pointer and free
- * old cluster. This is what this loop does */
- if(l2_table[l2_index + i] != 0)
+ * each write allocates separate cluster and writes data concurrently.
+ * The first one to complete updates l2 table with pointer to its
+ * cluster the second one has to do RMW (which is done above by
+ * perform_cow()), update l2 table with its cluster pointer and free
+ * old cluster. This is what this loop does */
+ if (l2_table[l2_index + i] != 0) {
old_cluster[j++] = l2_table[l2_index + i];
+ }
l2_table[l2_index + i] = cpu_to_be64((cluster_offset +
(i << s->cluster_bits)) | QCOW_OFLAG_COPIED);
--
1.8.3.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [Qemu-devel] [PATCH v2 4/5] qcow2: Use bytes instead of sectors for QCowL2Meta
2016-06-06 14:59 [Qemu-devel] [PATCH v2 0/5] qcow2: Implement .bdrv_co_preadv/pwritev Kevin Wolf
` (2 preceding siblings ...)
2016-06-06 14:59 ` [Qemu-devel] [PATCH v2 3/5] qcow2: Make copy_sectors() byte based Kevin Wolf
@ 2016-06-06 14:59 ` Kevin Wolf
2016-06-07 3:50 ` Eric Blake
2016-06-06 14:59 ` [Qemu-devel] [PATCH v2 5/5] qcow2: Implement .bdrv_co_pwritev() Kevin Wolf
2016-06-09 13:27 ` [Qemu-devel] [PATCH v2 0/5] qcow2: Implement .bdrv_co_preadv/pwritev Kevin Wolf
5 siblings, 1 reply; 15+ messages in thread
From: Kevin Wolf @ 2016-06-06 14:59 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, mreitz, jsnow, eblake, qemu-devel
In preparation for implementing .bdrv_co_pwritev in qcow2.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block/qcow2-cluster.c | 32 ++++++++++++--------------------
block/qcow2.h | 13 +++----------
2 files changed, 15 insertions(+), 30 deletions(-)
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index e84d6db..0182675 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -738,13 +738,12 @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m, Qcow2COWRegion *r)
BDRVQcow2State *s = bs->opaque;
int ret;
- if (r->nb_sectors == 0) {
+ if (r->nb_bytes == 0) {
return 0;
}
qemu_co_mutex_unlock(&s->lock);
- ret = do_perform_cow(bs, m->offset, m->alloc_offset,
- r->offset, r->nb_sectors * BDRV_SECTOR_SIZE);
+ ret = do_perform_cow(bs, m->offset, m->alloc_offset, r->offset, r->nb_bytes);
qemu_co_mutex_lock(&s->lock);
if (ret < 0) {
@@ -1195,25 +1194,20 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
/*
* Save info needed for meta data update.
*
- * requested_sectors: Number of sectors from the start of the first
+ * requested_bytes: Number of bytes from the start of the first
* newly allocated cluster to the end of the (possibly shortened
* before) write request.
*
- * avail_sectors: Number of sectors from the start of the first
+ * avail_bytes: Number of bytes from the start of the first
* newly allocated to the end of the last newly allocated cluster.
*
- * nb_sectors: The number of sectors from the start of the first
+ * nb_bytes: The number of bytes from the start of the first
* newly allocated cluster to the end of the area that the write
* request actually writes to (excluding COW at the end)
*/
- int requested_sectors =
- (*bytes + offset_into_cluster(s, guest_offset))
- >> BDRV_SECTOR_BITS;
- int avail_sectors = nb_clusters
- << (s->cluster_bits - BDRV_SECTOR_BITS);
- int alloc_n_start = offset_into_cluster(s, guest_offset)
- >> BDRV_SECTOR_BITS;
- int nb_sectors = MIN(requested_sectors, avail_sectors);
+ uint64_t requested_bytes = *bytes + offset_into_cluster(s, guest_offset);
+ int avail_bytes = MIN(INT_MAX, nb_clusters << s->cluster_bits);
+ int nb_bytes = MIN(requested_bytes, avail_bytes);
QCowL2Meta *old_m = *m;
*m = g_malloc0(sizeof(**m));
@@ -1224,23 +1218,21 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
.alloc_offset = alloc_cluster_offset,
.offset = start_of_cluster(s, guest_offset),
.nb_clusters = nb_clusters,
- .nb_available = nb_sectors,
.cow_start = {
.offset = 0,
- .nb_sectors = alloc_n_start,
+ .nb_bytes = offset_into_cluster(s, guest_offset),
},
.cow_end = {
- .offset = nb_sectors * BDRV_SECTOR_SIZE,
- .nb_sectors = avail_sectors - nb_sectors,
+ .offset = nb_bytes,
+ .nb_bytes = avail_bytes - nb_bytes,
},
};
qemu_co_queue_init(&(*m)->dependent_requests);
QLIST_INSERT_HEAD(&s->cluster_allocs, *m, next_in_flight);
*host_offset = alloc_cluster_offset + offset_into_cluster(s, guest_offset);
- *bytes = MIN(*bytes, (nb_sectors * BDRV_SECTOR_SIZE)
- - offset_into_cluster(s, guest_offset));
+ *bytes = MIN(*bytes, nb_bytes - offset_into_cluster(s, guest_offset));
assert(*bytes != 0);
return 1;
diff --git a/block/qcow2.h b/block/qcow2.h
index e2c42d5..175b0c1 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -302,8 +302,8 @@ typedef struct Qcow2COWRegion {
*/
uint64_t offset;
- /** Number of sectors to copy */
- int nb_sectors;
+ /** Number of bytes to copy */
+ int nb_bytes;
} Qcow2COWRegion;
/**
@@ -318,12 +318,6 @@ typedef struct QCowL2Meta
/** Host offset of the first newly allocated cluster */
uint64_t alloc_offset;
- /**
- * Number of sectors from the start of the first allocated cluster to
- * the end of the (possibly shortened) request
- */
- int nb_available;
-
/** Number of newly allocated clusters */
int nb_clusters;
@@ -471,8 +465,7 @@ static inline uint64_t l2meta_cow_start(QCowL2Meta *m)
static inline uint64_t l2meta_cow_end(QCowL2Meta *m)
{
- return m->offset + m->cow_end.offset
- + (m->cow_end.nb_sectors << BDRV_SECTOR_BITS);
+ return m->offset + m->cow_end.offset + m->cow_end.nb_bytes;
}
static inline uint64_t refcount_diff(uint64_t r1, uint64_t r2)
--
1.8.3.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [Qemu-devel] [PATCH v2 5/5] qcow2: Implement .bdrv_co_pwritev()
2016-06-06 14:59 [Qemu-devel] [PATCH v2 0/5] qcow2: Implement .bdrv_co_preadv/pwritev Kevin Wolf
` (3 preceding siblings ...)
2016-06-06 14:59 ` [Qemu-devel] [PATCH v2 4/5] qcow2: Use bytes instead of sectors for QCowL2Meta Kevin Wolf
@ 2016-06-06 14:59 ` Kevin Wolf
2016-06-07 4:02 ` Eric Blake
2016-06-09 13:27 ` [Qemu-devel] [PATCH v2 0/5] qcow2: Implement .bdrv_co_preadv/pwritev Kevin Wolf
5 siblings, 1 reply; 15+ messages in thread
From: Kevin Wolf @ 2016-06-06 14:59 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, mreitz, jsnow, eblake, qemu-devel
This changes qcow2 to implement the byte-based .bdrv_co_pwritev
interface rather than the sector-based old one.
As preallocation uses the same allocation function as normal writes, and
the interface of that function needs to be changed, it is converted in
the same patch.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block/qcow2-cluster.c | 13 ++++----
block/qcow2.c | 89 ++++++++++++++++++++++++---------------------------
block/qcow2.h | 3 +-
trace-events | 6 ++--
4 files changed, 52 insertions(+), 59 deletions(-)
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 0182675..b97eb8f 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1264,7 +1264,8 @@ fail:
* Return 0 on success and -errno in error cases
*/
int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
- int *num, uint64_t *host_offset, QCowL2Meta **m)
+ unsigned int *bytes, uint64_t *host_offset,
+ QCowL2Meta **m)
{
BDRVQcow2State *s = bs->opaque;
uint64_t start, remaining;
@@ -1272,13 +1273,11 @@ int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
uint64_t cur_bytes;
int ret;
- trace_qcow2_alloc_clusters_offset(qemu_coroutine_self(), offset, *num);
-
- assert((offset & ~BDRV_SECTOR_MASK) == 0);
+ trace_qcow2_alloc_clusters_offset(qemu_coroutine_self(), offset, *bytes);
again:
start = offset;
- remaining = (uint64_t)*num << BDRV_SECTOR_BITS;
+ remaining = *bytes;
cluster_offset = 0;
*host_offset = 0;
cur_bytes = 0;
@@ -1364,8 +1363,8 @@ again:
}
}
- *num -= remaining >> BDRV_SECTOR_BITS;
- assert(*num > 0);
+ *bytes -= remaining;
+ assert(*bytes > 0);
assert(*host_offset != 0);
return 0;
diff --git a/block/qcow2.c b/block/qcow2.c
index 545dea0..cb55e2d 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1544,23 +1544,21 @@ fail:
return ret;
}
-static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
- int64_t sector_num,
- int remaining_sectors,
- QEMUIOVector *qiov)
+static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
+ uint64_t bytes, QEMUIOVector *qiov,
+ int flags)
{
BDRVQcow2State *s = bs->opaque;
- int index_in_cluster;
+ int offset_in_cluster;
int ret;
- int cur_nr_sectors; /* number of sectors in current iteration */
+ unsigned int cur_bytes; /* number of sectors in current iteration */
uint64_t cluster_offset;
QEMUIOVector hd_qiov;
uint64_t bytes_done = 0;
uint8_t *cluster_data = NULL;
QCowL2Meta *l2meta = NULL;
- trace_qcow2_writev_start_req(qemu_coroutine_self(), sector_num,
- remaining_sectors);
+ trace_qcow2_writev_start_req(qemu_coroutine_self(), offset, bytes);
qemu_iovec_init(&hd_qiov, qiov->niov);
@@ -1568,22 +1566,21 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
qemu_co_mutex_lock(&s->lock);
- while (remaining_sectors != 0) {
+ while (bytes != 0) {
l2meta = NULL;
trace_qcow2_writev_start_part(qemu_coroutine_self());
- index_in_cluster = sector_num & (s->cluster_sectors - 1);
- cur_nr_sectors = remaining_sectors;
- if (bs->encrypted &&
- cur_nr_sectors >
- QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors - index_in_cluster) {
- cur_nr_sectors =
- QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors - index_in_cluster;
+ offset_in_cluster = offset_into_cluster(s, offset);
+ cur_bytes = MIN(bytes, INT_MAX);
+ if (bs->encrypted) {
+ cur_bytes = MIN(cur_bytes,
+ QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size
+ - offset_in_cluster);
}
- ret = qcow2_alloc_cluster_offset(bs, sector_num << 9,
- &cur_nr_sectors, &cluster_offset, &l2meta);
+ ret = qcow2_alloc_cluster_offset(bs, offset, &cur_bytes,
+ &cluster_offset, &l2meta);
if (ret < 0) {
goto fail;
}
@@ -1591,8 +1588,7 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
assert((cluster_offset & 511) == 0);
qemu_iovec_reset(&hd_qiov);
- qemu_iovec_concat(&hd_qiov, qiov, bytes_done,
- cur_nr_sectors * 512);
+ qemu_iovec_concat(&hd_qiov, qiov, bytes_done, cur_bytes);
if (bs->encrypted) {
Error *err = NULL;
@@ -1611,8 +1607,9 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size);
qemu_iovec_to_buf(&hd_qiov, 0, cluster_data, hd_qiov.size);
- if (qcow2_encrypt_sectors(s, sector_num, cluster_data,
- cluster_data, cur_nr_sectors,
+ if (qcow2_encrypt_sectors(s, offset >> BDRV_SECTOR_BITS,
+ cluster_data, cluster_data,
+ cur_bytes >>BDRV_SECTOR_BITS,
true, &err) < 0) {
error_free(err);
ret = -EIO;
@@ -1620,13 +1617,11 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
}
qemu_iovec_reset(&hd_qiov);
- qemu_iovec_add(&hd_qiov, cluster_data,
- cur_nr_sectors * 512);
+ qemu_iovec_add(&hd_qiov, cluster_data, cur_bytes);
}
ret = qcow2_pre_write_overlap_check(bs, 0,
- cluster_offset + index_in_cluster * BDRV_SECTOR_SIZE,
- cur_nr_sectors * BDRV_SECTOR_SIZE);
+ cluster_offset + offset_in_cluster, cur_bytes);
if (ret < 0) {
goto fail;
}
@@ -1634,10 +1629,10 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
qemu_co_mutex_unlock(&s->lock);
BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
trace_qcow2_writev_data(qemu_coroutine_self(),
- (cluster_offset >> 9) + index_in_cluster);
- ret = bdrv_co_writev(bs->file->bs,
- (cluster_offset >> 9) + index_in_cluster,
- cur_nr_sectors, &hd_qiov);
+ cluster_offset + offset_in_cluster);
+ ret = bdrv_co_pwritev(bs->file->bs,
+ cluster_offset + offset_in_cluster,
+ cur_bytes, &hd_qiov, 0);
qemu_co_mutex_lock(&s->lock);
if (ret < 0) {
goto fail;
@@ -1663,10 +1658,10 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
l2meta = next;
}
- remaining_sectors -= cur_nr_sectors;
- sector_num += cur_nr_sectors;
- bytes_done += cur_nr_sectors * 512;
- trace_qcow2_writev_done_part(qemu_coroutine_self(), cur_nr_sectors);
+ bytes -= cur_bytes;
+ offset += cur_bytes;
+ bytes_done += cur_bytes;
+ trace_qcow2_writev_done_part(qemu_coroutine_self(), cur_bytes);
}
ret = 0;
@@ -2008,19 +2003,19 @@ static int qcow2_change_backing_file(BlockDriverState *bs,
static int preallocate(BlockDriverState *bs)
{
- uint64_t nb_sectors;
+ uint64_t bytes;
uint64_t offset;
uint64_t host_offset = 0;
- int num;
+ unsigned int cur_bytes;
int ret;
QCowL2Meta *meta;
- nb_sectors = bdrv_nb_sectors(bs);
+ bytes = bdrv_getlength(bs);
offset = 0;
- while (nb_sectors) {
- num = MIN(nb_sectors, INT_MAX >> BDRV_SECTOR_BITS);
- ret = qcow2_alloc_cluster_offset(bs, offset, &num,
+ while (bytes) {
+ cur_bytes = MIN(bytes, INT_MAX);
+ ret = qcow2_alloc_cluster_offset(bs, offset, &cur_bytes,
&host_offset, &meta);
if (ret < 0) {
return ret;
@@ -2046,8 +2041,8 @@ static int preallocate(BlockDriverState *bs)
/* TODO Preallocate data if requested */
- nb_sectors -= num;
- offset += num << BDRV_SECTOR_BITS;
+ bytes -= cur_bytes;
+ offset += cur_bytes;
}
/*
@@ -2056,11 +2051,9 @@ static int preallocate(BlockDriverState *bs)
* EOF). Extend the image to the last allocated sector.
*/
if (host_offset != 0) {
- uint8_t buf[BDRV_SECTOR_SIZE];
- memset(buf, 0, BDRV_SECTOR_SIZE);
- ret = bdrv_write(bs->file->bs,
- (host_offset >> BDRV_SECTOR_BITS) + num - 1,
- buf, 1);
+ uint8_t data = 0;
+ ret = bdrv_pwrite(bs->file->bs, (host_offset + cur_bytes) - 1,
+ &data, 1);
if (ret < 0) {
return ret;
}
@@ -3379,7 +3372,7 @@ BlockDriver bdrv_qcow2 = {
.bdrv_set_key = qcow2_set_key,
.bdrv_co_preadv = qcow2_co_preadv,
- .bdrv_co_writev = qcow2_co_writev,
+ .bdrv_co_pwritev = qcow2_co_pwritev,
.bdrv_co_flush_to_os = qcow2_co_flush_to_os,
.bdrv_co_pwrite_zeroes = qcow2_co_pwrite_zeroes,
diff --git a/block/qcow2.h b/block/qcow2.h
index 175b0c1..b36a7bf 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -539,7 +539,8 @@ int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
unsigned int *bytes, uint64_t *cluster_offset);
int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
- int *num, uint64_t *host_offset, QCowL2Meta **m);
+ unsigned int *bytes, uint64_t *host_offset,
+ QCowL2Meta **m);
uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
uint64_t offset,
int compressed_size);
diff --git a/trace-events b/trace-events
index 7cc6774..78c74b5 100644
--- a/trace-events
+++ b/trace-events
@@ -606,16 +606,16 @@ qemu_system_shutdown_request(void) ""
qemu_system_powerdown_request(void) ""
# block/qcow2.c
-qcow2_writev_start_req(void *co, int64_t sector, int nb_sectors) "co %p sector %" PRIx64 " nb_sectors %d"
+qcow2_writev_start_req(void *co, int64_t offset, int bytes) "co %p offset %" PRIx64 " bytes %d"
qcow2_writev_done_req(void *co, int ret) "co %p ret %d"
qcow2_writev_start_part(void *co) "co %p"
-qcow2_writev_done_part(void *co, int cur_nr_sectors) "co %p cur_nr_sectors %d"
+qcow2_writev_done_part(void *co, int cur_bytes) "co %p cur_bytes %d"
qcow2_writev_data(void *co, uint64_t offset) "co %p offset %" PRIx64
qcow2_pwrite_zeroes_start_req(void *co, int64_t offset, int count) "co %p offset %" PRIx64 " count %d"
qcow2_pwrite_zeroes(void *co, int64_t offset, int count) "co %p offset %" PRIx64 " count %d"
# block/qcow2-cluster.c
-qcow2_alloc_clusters_offset(void *co, uint64_t offset, int num) "co %p offset %" PRIx64 " num %d"
+qcow2_alloc_clusters_offset(void *co, uint64_t offset, int bytes) "co %p offset %" PRIx64 " bytes %d"
qcow2_handle_copied(void *co, uint64_t guest_offset, uint64_t host_offset, uint64_t bytes) "co %p guest_offset %" PRIx64 " host_offset %" PRIx64 " bytes %" PRIx64
qcow2_handle_alloc(void *co, uint64_t guest_offset, uint64_t host_offset, uint64_t bytes) "co %p guest_offset %" PRIx64 " host_offset %" PRIx64 " bytes %" PRIx64
qcow2_do_alloc_clusters_offset(void *co, uint64_t guest_offset, uint64_t host_offset, int nb_clusters) "co %p guest_offset %" PRIx64 " host_offset %" PRIx64 " nb_clusters %d"
--
1.8.3.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH v2 1/5] qcow2: Work with bytes in qcow2_get_cluster_offset()
2016-06-06 14:59 ` [Qemu-devel] [PATCH v2 1/5] qcow2: Work with bytes in qcow2_get_cluster_offset() Kevin Wolf
@ 2016-06-06 17:11 ` Eric Blake
0 siblings, 0 replies; 15+ messages in thread
From: Eric Blake @ 2016-06-06 17:11 UTC (permalink / raw)
To: Kevin Wolf, qemu-block; +Cc: mreitz, jsnow, qemu-devel
[-- Attachment #1: Type: text/plain, Size: 558 bytes --]
On 06/06/2016 08:59 AM, Kevin Wolf wrote:
> This patch changes the units that qcow2_get_cluster_offset() uses
> internally, without touching the interface just yet. This will be done
> in another patch.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
> block/qcow2-cluster.c | 44 +++++++++++++++++++++++---------------------
> 1 file changed, 23 insertions(+), 21 deletions(-)
>
Reviewed-by: Eric Blake <eblake@redhat.com>
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH v2 2/5] qcow2: Implement .bdrv_co_preadv()
2016-06-06 14:59 ` [Qemu-devel] [PATCH v2 2/5] qcow2: Implement .bdrv_co_preadv() Kevin Wolf
@ 2016-06-06 21:32 ` Eric Blake
0 siblings, 0 replies; 15+ messages in thread
From: Eric Blake @ 2016-06-06 21:32 UTC (permalink / raw)
To: Kevin Wolf, qemu-block; +Cc: mreitz, jsnow, qemu-devel
[-- Attachment #1: Type: text/plain, Size: 2701 bytes --]
On 06/06/2016 08:59 AM, Kevin Wolf wrote:
> Reading from qcow2 images is now byte granularity.
>
> Most of the affected code in qcow2 actually gets simpler with this
> change. The only exception is encryption, which is fixed on 512 bytes
> blocks; in order to keep this working, bs->request_alignment is set for
> encrypted images.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
> block/qcow2-cluster.c | 27 ++++++-------
> block/qcow2.c | 108 +++++++++++++++++++++++++++-----------------------
> block/qcow2.h | 2 +-
> 3 files changed, 72 insertions(+), 65 deletions(-)
>
> - * on exit, *num is the number of contiguous sectors we can read.
> + * On exit, *bytes is the number of bytes staring at offset that have the same
s/staring/starting/
(although "staring" does make for a rather interesting thought :)
> @@ -1389,26 +1402,24 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
>
> qemu_co_mutex_lock(&s->lock);
>
> - while (remaining_sectors != 0) {
> + while (bytes != 0) {
>
> /* prepare next request */
> - cur_nr_sectors = remaining_sectors;
> + cur_bytes = MIN(bytes, INT_MAX);
> if (s->cipher) {
> - cur_nr_sectors = MIN(cur_nr_sectors,
> - QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors);
> + cur_bytes = MIN(cur_bytes,
> + QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size);
> }
Can this ever result in cur_bytes that is not aligned, even though
encryption requires aligned reads? Do you need to round anything to a
sector boundary in this case?
> qemu_co_mutex_unlock(&s->lock);
> - ret = bdrv_co_readv(bs->file->bs,
> - (cluster_offset >> 9) + index_in_cluster,
> - cur_nr_sectors, &hd_qiov);
> + ret = bdrv_co_preadv(bs->file->bs,
> + cluster_offset + offset_in_cluster,
> + cur_bytes, &hd_qiov, 0);
> qemu_co_mutex_lock(&s->lock);
> if (ret < 0) {
> goto fail;
> }
> if (bs->encrypted) {
> assert(s->cipher);
> + assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
> + assert((cur_bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
Hmm, you do check that later on.
So, other than the typo (easy to fix on commit),
Reviewed-by: Eric Blake <eblake@redhat.com>
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH v2 3/5] qcow2: Make copy_sectors() byte based
2016-06-06 14:59 ` [Qemu-devel] [PATCH v2 3/5] qcow2: Make copy_sectors() byte based Kevin Wolf
@ 2016-06-07 3:47 ` Eric Blake
2016-06-07 9:12 ` Kevin Wolf
2016-06-07 9:20 ` [Qemu-devel] [PATCH v3 " Kevin Wolf
0 siblings, 2 replies; 15+ messages in thread
From: Eric Blake @ 2016-06-07 3:47 UTC (permalink / raw)
To: Kevin Wolf, qemu-block; +Cc: mreitz, jsnow, qemu-devel
[-- Attachment #1: Type: text/plain, Size: 1671 bytes --]
On 06/06/2016 08:59 AM, Kevin Wolf wrote:
> This will allow copy on write operations where the overwritten part of
> the cluster is not aligned to sector boundaries.
>
> Also rename the function because it has nothing to do with sectors any
> more.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
> block/qcow2-cluster.c | 54 +++++++++++++++++++++++++--------------------------
> 1 file changed, 26 insertions(+), 28 deletions(-)
>
>
> if (bs->encrypted) {
> Error *err = NULL;
> + int sector = (cluster_offset + offset_in_cluster) >> BDRV_SECTOR_BITS;
Potentially the wrong type...
> assert(s->cipher);
> - if (qcow2_encrypt_sectors(s, start_sect + n_start,
> - iov.iov_base, iov.iov_base, n,
> - true, &err) < 0) {
> + assert((offset_in_cluster & BDRV_SECTOR_MASK) == 0);
Why is this one true? If I have a cluster of 4 sectors, why must
offset_in_cluster fall within only the first of those sectors? Are you
missing a ~, to instead be asserting that offset_in_cluster is
sector-aligned?
> + assert((bytes & ~BDRV_SECTOR_MASK) == 0);
This one looks correct, stating that the number of bytes to copy is a
sector multiple.
> + if (qcow2_encrypt_sectors(s, sector, iov.iov_base, iov.iov_base,
> + bytes >> BDRV_SECTOR_BITS, true, &err) < 0) {
...since encryption allows a 64-bit sector number for the case where the
image is larger than 2T.
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH v2 4/5] qcow2: Use bytes instead of sectors for QCowL2Meta
2016-06-06 14:59 ` [Qemu-devel] [PATCH v2 4/5] qcow2: Use bytes instead of sectors for QCowL2Meta Kevin Wolf
@ 2016-06-07 3:50 ` Eric Blake
0 siblings, 0 replies; 15+ messages in thread
From: Eric Blake @ 2016-06-07 3:50 UTC (permalink / raw)
To: Kevin Wolf, qemu-block; +Cc: mreitz, jsnow, qemu-devel
[-- Attachment #1: Type: text/plain, Size: 490 bytes --]
On 06/06/2016 08:59 AM, Kevin Wolf wrote:
> In preparation for implementing .bdrv_co_pwritev in qcow2.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
> block/qcow2-cluster.c | 32 ++++++++++++--------------------
> block/qcow2.h | 13 +++----------
> 2 files changed, 15 insertions(+), 30 deletions(-)
>
Reviewed-by: Eric Blake <eblake@redhat.com>
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH v2 5/5] qcow2: Implement .bdrv_co_pwritev()
2016-06-06 14:59 ` [Qemu-devel] [PATCH v2 5/5] qcow2: Implement .bdrv_co_pwritev() Kevin Wolf
@ 2016-06-07 4:02 ` Eric Blake
0 siblings, 0 replies; 15+ messages in thread
From: Eric Blake @ 2016-06-07 4:02 UTC (permalink / raw)
To: Kevin Wolf, qemu-block; +Cc: mreitz, jsnow, qemu-devel
[-- Attachment #1: Type: text/plain, Size: 803 bytes --]
On 06/06/2016 08:59 AM, Kevin Wolf wrote:
> This changes qcow2 to implement the byte-based .bdrv_co_pwritev
> interface rather than the sector-based old one.
>
> As preallocation uses the same allocation function as normal writes, and
> the interface of that function needs to be changed, it is converted in
> the same patch.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
> block/qcow2-cluster.c | 13 ++++----
> block/qcow2.c | 89 ++++++++++++++++++++++++---------------------------
> block/qcow2.h | 3 +-
> trace-events | 6 ++--
> 4 files changed, 52 insertions(+), 59 deletions(-)
>
Reviewed-by: Eric Blake <eblake@redhat.com>
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH v2 3/5] qcow2: Make copy_sectors() byte based
2016-06-07 3:47 ` Eric Blake
@ 2016-06-07 9:12 ` Kevin Wolf
2016-06-07 9:20 ` [Qemu-devel] [PATCH v3 " Kevin Wolf
1 sibling, 0 replies; 15+ messages in thread
From: Kevin Wolf @ 2016-06-07 9:12 UTC (permalink / raw)
To: Eric Blake; +Cc: qemu-block, mreitz, jsnow, qemu-devel
[-- Attachment #1: Type: text/plain, Size: 1920 bytes --]
Am 07.06.2016 um 05:47 hat Eric Blake geschrieben:
> On 06/06/2016 08:59 AM, Kevin Wolf wrote:
> > This will allow copy on write operations where the overwritten part of
> > the cluster is not aligned to sector boundaries.
> >
> > Also rename the function because it has nothing to do with sectors any
> > more.
> >
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> > block/qcow2-cluster.c | 54 +++++++++++++++++++++++++--------------------------
> > 1 file changed, 26 insertions(+), 28 deletions(-)
> >
>
> >
> > if (bs->encrypted) {
> > Error *err = NULL;
> > + int sector = (cluster_offset + offset_in_cluster) >> BDRV_SECTOR_BITS;
>
> Potentially the wrong type...
Yes, thanks for catching that.
> > assert(s->cipher);
> > - if (qcow2_encrypt_sectors(s, start_sect + n_start,
> > - iov.iov_base, iov.iov_base, n,
> > - true, &err) < 0) {
> > + assert((offset_in_cluster & BDRV_SECTOR_MASK) == 0);
>
> Why is this one true? If I have a cluster of 4 sectors, why must
> offset_in_cluster fall within only the first of those sectors? Are you
> missing a ~, to instead be asserting that offset_in_cluster is
> sector-aligned?
You mean I should actually test encrypted images? *cough* (I know I did
test something with encryption, but maybe that was only after converting
reads.)
Anyway, missing ~ indeed.
> > + assert((bytes & ~BDRV_SECTOR_MASK) == 0);
>
> This one looks correct, stating that the number of bytes to copy is a
> sector multiple.
>
> > + if (qcow2_encrypt_sectors(s, sector, iov.iov_base, iov.iov_base,
> > + bytes >> BDRV_SECTOR_BITS, true, &err) < 0) {
>
> ...since encryption allows a 64-bit sector number for the case where the
> image is larger than 2T.
Kevin
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Qemu-devel] [PATCH v3 3/5] qcow2: Make copy_sectors() byte based
2016-06-07 3:47 ` Eric Blake
2016-06-07 9:12 ` Kevin Wolf
@ 2016-06-07 9:20 ` Kevin Wolf
2016-06-07 11:57 ` Eric Blake
1 sibling, 1 reply; 15+ messages in thread
From: Kevin Wolf @ 2016-06-07 9:20 UTC (permalink / raw)
To: qemu-block; +Cc: kwolf, mreitz, jsnow, eblake, qemu-devel
This will allow copy on write operations where the overwritten part of
the cluster is not aligned to sector boundaries.
Also rename the function because it has nothing to do with sectors any
more.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block/qcow2-cluster.c | 55 +++++++++++++++++++++++++--------------------------
1 file changed, 27 insertions(+), 28 deletions(-)
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index c5b739e..1e3f7f0 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -390,22 +390,18 @@ int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
return 0;
}
-static int coroutine_fn copy_sectors(BlockDriverState *bs,
- uint64_t start_sect,
- uint64_t cluster_offset,
- int n_start, int n_end)
+static int coroutine_fn do_perform_cow(BlockDriverState *bs,
+ uint64_t src_cluster_offset,
+ uint64_t cluster_offset,
+ int offset_in_cluster,
+ int bytes)
{
BDRVQcow2State *s = bs->opaque;
QEMUIOVector qiov;
struct iovec iov;
- int n, ret;
-
- n = n_end - n_start;
- if (n <= 0) {
- return 0;
- }
+ int ret;
- iov.iov_len = n * BDRV_SECTOR_SIZE;
+ iov.iov_len = bytes;
iov.iov_base = qemu_try_blockalign(bs, iov.iov_len);
if (iov.iov_base == NULL) {
return -ENOMEM;
@@ -424,18 +420,21 @@ static int coroutine_fn copy_sectors(BlockDriverState *bs,
* interface. This avoids double I/O throttling and request tracking,
* which can lead to deadlock when block layer copy-on-read is enabled.
*/
- ret = bs->drv->bdrv_co_preadv(bs, (start_sect + n_start) * BDRV_SECTOR_SIZE,
- n * BDRV_SECTOR_SIZE, &qiov, 0);
+ ret = bs->drv->bdrv_co_preadv(bs, src_cluster_offset + offset_in_cluster,
+ bytes, &qiov, 0);
if (ret < 0) {
goto out;
}
if (bs->encrypted) {
Error *err = NULL;
+ int64_t sector = (cluster_offset + offset_in_cluster)
+ >> BDRV_SECTOR_BITS;
assert(s->cipher);
- if (qcow2_encrypt_sectors(s, start_sect + n_start,
- iov.iov_base, iov.iov_base, n,
- true, &err) < 0) {
+ assert((offset_in_cluster & ~BDRV_SECTOR_MASK) == 0);
+ assert((bytes & ~BDRV_SECTOR_MASK) == 0);
+ if (qcow2_encrypt_sectors(s, sector, iov.iov_base, iov.iov_base,
+ bytes >> BDRV_SECTOR_BITS, true, &err) < 0) {
ret = -EIO;
error_free(err);
goto out;
@@ -443,14 +442,14 @@ static int coroutine_fn copy_sectors(BlockDriverState *bs,
}
ret = qcow2_pre_write_overlap_check(bs, 0,
- cluster_offset + n_start * BDRV_SECTOR_SIZE, n * BDRV_SECTOR_SIZE);
+ cluster_offset + offset_in_cluster, bytes);
if (ret < 0) {
goto out;
}
BLKDBG_EVENT(bs->file, BLKDBG_COW_WRITE);
- ret = bdrv_co_writev(bs->file->bs, (cluster_offset >> 9) + n_start, n,
- &qiov);
+ ret = bdrv_co_pwritev(bs->file->bs, cluster_offset + offset_in_cluster,
+ bytes, &qiov, 0);
if (ret < 0) {
goto out;
}
@@ -745,9 +744,8 @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m, Qcow2COWRegion *r)
}
qemu_co_mutex_unlock(&s->lock);
- ret = copy_sectors(bs, m->offset / BDRV_SECTOR_SIZE, m->alloc_offset,
- r->offset / BDRV_SECTOR_SIZE,
- r->offset / BDRV_SECTOR_SIZE + r->nb_sectors);
+ ret = do_perform_cow(bs, m->offset, m->alloc_offset,
+ r->offset, r->nb_sectors * BDRV_SECTOR_SIZE);
qemu_co_mutex_lock(&s->lock);
if (ret < 0) {
@@ -809,13 +807,14 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
assert(l2_index + m->nb_clusters <= s->l2_size);
for (i = 0; i < m->nb_clusters; i++) {
/* if two concurrent writes happen to the same unallocated cluster
- * each write allocates separate cluster and writes data concurrently.
- * The first one to complete updates l2 table with pointer to its
- * cluster the second one has to do RMW (which is done above by
- * copy_sectors()), update l2 table with its cluster pointer and free
- * old cluster. This is what this loop does */
- if(l2_table[l2_index + i] != 0)
+ * each write allocates separate cluster and writes data concurrently.
+ * The first one to complete updates l2 table with pointer to its
+ * cluster the second one has to do RMW (which is done above by
+ * perform_cow()), update l2 table with its cluster pointer and free
+ * old cluster. This is what this loop does */
+ if (l2_table[l2_index + i] != 0) {
old_cluster[j++] = l2_table[l2_index + i];
+ }
l2_table[l2_index + i] = cpu_to_be64((cluster_offset +
(i << s->cluster_bits)) | QCOW_OFLAG_COPIED);
--
1.8.3.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH v3 3/5] qcow2: Make copy_sectors() byte based
2016-06-07 9:20 ` [Qemu-devel] [PATCH v3 " Kevin Wolf
@ 2016-06-07 11:57 ` Eric Blake
0 siblings, 0 replies; 15+ messages in thread
From: Eric Blake @ 2016-06-07 11:57 UTC (permalink / raw)
To: Kevin Wolf, qemu-block; +Cc: mreitz, jsnow, qemu-devel
[-- Attachment #1: Type: text/plain, Size: 614 bytes --]
On 06/07/2016 03:20 AM, Kevin Wolf wrote:
> This will allow copy on write operations where the overwritten part of
> the cluster is not aligned to sector boundaries.
>
> Also rename the function because it has nothing to do with sectors any
> more.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
> block/qcow2-cluster.c | 55 +++++++++++++++++++++++++--------------------------
> 1 file changed, 27 insertions(+), 28 deletions(-)
>
Reviewed-by: Eric Blake <eblake@redhat.com>
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH v2 0/5] qcow2: Implement .bdrv_co_preadv/pwritev
2016-06-06 14:59 [Qemu-devel] [PATCH v2 0/5] qcow2: Implement .bdrv_co_preadv/pwritev Kevin Wolf
` (4 preceding siblings ...)
2016-06-06 14:59 ` [Qemu-devel] [PATCH v2 5/5] qcow2: Implement .bdrv_co_pwritev() Kevin Wolf
@ 2016-06-09 13:27 ` Kevin Wolf
5 siblings, 0 replies; 15+ messages in thread
From: Kevin Wolf @ 2016-06-09 13:27 UTC (permalink / raw)
To: qemu-block; +Cc: mreitz, jsnow, eblake, qemu-devel
Am 06.06.2016 um 16:59 hat Kevin Wolf geschrieben:
> This series converts qcow2 to the byte-based I/O interfaces. This simplifies
> the code by removing many unit conversions, and in the unlikely case of actual
> unaligned requests, it even makes the driver work more efficiently by avoiding
> read-modify-write.
Applied to the block branch.
Kevin
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2016-06-09 13:27 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-06-06 14:59 [Qemu-devel] [PATCH v2 0/5] qcow2: Implement .bdrv_co_preadv/pwritev Kevin Wolf
2016-06-06 14:59 ` [Qemu-devel] [PATCH v2 1/5] qcow2: Work with bytes in qcow2_get_cluster_offset() Kevin Wolf
2016-06-06 17:11 ` Eric Blake
2016-06-06 14:59 ` [Qemu-devel] [PATCH v2 2/5] qcow2: Implement .bdrv_co_preadv() Kevin Wolf
2016-06-06 21:32 ` Eric Blake
2016-06-06 14:59 ` [Qemu-devel] [PATCH v2 3/5] qcow2: Make copy_sectors() byte based Kevin Wolf
2016-06-07 3:47 ` Eric Blake
2016-06-07 9:12 ` Kevin Wolf
2016-06-07 9:20 ` [Qemu-devel] [PATCH v3 " Kevin Wolf
2016-06-07 11:57 ` Eric Blake
2016-06-06 14:59 ` [Qemu-devel] [PATCH v2 4/5] qcow2: Use bytes instead of sectors for QCowL2Meta Kevin Wolf
2016-06-07 3:50 ` Eric Blake
2016-06-06 14:59 ` [Qemu-devel] [PATCH v2 5/5] qcow2: Implement .bdrv_co_pwritev() Kevin Wolf
2016-06-07 4:02 ` Eric Blake
2016-06-09 13:27 ` [Qemu-devel] [PATCH v2 0/5] qcow2: Implement .bdrv_co_preadv/pwritev Kevin Wolf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).