* [Qemu-devel] [PATCH v3 1/6] vmdk: Move vmdk_find_offset_in_cluster() to the top
2017-04-01 14:44 [Qemu-devel] [PATCH v3 0/6] Optimize VMDK I/O by allocating multiple clusters Ashijeet Acharya
@ 2017-04-01 14:44 ` Ashijeet Acharya
2017-04-10 13:04 ` Ashijeet Acharya
2017-04-01 14:44 ` [Qemu-devel] [PATCH v3 2/6] vmdk: Rename get_whole_cluster() to vmdk_perform_cow() Ashijeet Acharya
` (4 subsequent siblings)
5 siblings, 1 reply; 21+ messages in thread
From: Ashijeet Acharya @ 2017-04-01 14:44 UTC (permalink / raw)
To: famz
Cc: kwolf, jsnow, mreitz, stefanha, qemu-devel, qemu-block,
Ashijeet Acharya
Move the existing vmdk_find_offset_in_cluster() function to the top of
the driver. Also, introduce a new helper function size_to_clusters()
which returns the number of clusters for a given size in bytes. Here,
we leave the last cluster as we need to perform COW for that one.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
---
block/vmdk.c | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/block/vmdk.c b/block/vmdk.c
index a9bd22b..22be887 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -242,6 +242,18 @@ static void vmdk_free_last_extent(BlockDriverState *bs)
s->extents = g_renew(VmdkExtent, s->extents, s->num_extents);
}
+static inline uint64_t vmdk_find_offset_in_cluster(VmdkExtent *extent,
+ int64_t offset)
+{
+ uint64_t extent_begin_offset, extent_relative_offset;
+ uint64_t cluster_size = extent->cluster_sectors * BDRV_SECTOR_SIZE;
+
+ extent_begin_offset =
+ (extent->end_sector - extent->sectors) * BDRV_SECTOR_SIZE;
+ extent_relative_offset = offset - extent_begin_offset;
+ return extent_relative_offset % cluster_size;
+}
+
static uint32_t vmdk_read_cid(BlockDriverState *bs, int parent)
{
char *desc;
@@ -1266,18 +1278,6 @@ static VmdkExtent *find_extent(BDRVVmdkState *s,
return NULL;
}
-static inline uint64_t vmdk_find_offset_in_cluster(VmdkExtent *extent,
- int64_t offset)
-{
- uint64_t extent_begin_offset, extent_relative_offset;
- uint64_t cluster_size = extent->cluster_sectors * BDRV_SECTOR_SIZE;
-
- extent_begin_offset =
- (extent->end_sector - extent->sectors) * BDRV_SECTOR_SIZE;
- extent_relative_offset = offset - extent_begin_offset;
- return extent_relative_offset % cluster_size;
-}
-
static inline uint64_t vmdk_find_index_in_cluster(VmdkExtent *extent,
int64_t sector_num)
{
--
2.6.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH v3 1/6] vmdk: Move vmdk_find_offset_in_cluster() to the top
2017-04-01 14:44 ` [Qemu-devel] [PATCH v3 1/6] vmdk: Move vmdk_find_offset_in_cluster() to the top Ashijeet Acharya
@ 2017-04-10 13:04 ` Ashijeet Acharya
2017-04-19 12:14 ` Fam Zheng
0 siblings, 1 reply; 21+ messages in thread
From: Ashijeet Acharya @ 2017-04-10 13:04 UTC (permalink / raw)
To: Fam Zheng
Cc: Kevin Wolf, John Snow, Max Reitz, Stefan Hajnoczi,
QEMU Developers, qemu block, Ashijeet Acharya
On Sat, Apr 1, 2017 at 8:14 PM, Ashijeet Acharya
<ashijeetacharya@gmail.com> wrote:
> Move the existing vmdk_find_offset_in_cluster() function to the top of
> the driver. Also, introduce a new helper function size_to_clusters()
> which returns the number of clusters for a given size in bytes. Here,
> we leave the last cluster as we need to perform COW for that one.
>
I will remove the trailing part of the commit message in v4 as there
is no size_to_clusters() in this patch anymore; I forgot to update it!
Ashijeet
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH v3 1/6] vmdk: Move vmdk_find_offset_in_cluster() to the top
2017-04-10 13:04 ` Ashijeet Acharya
@ 2017-04-19 12:14 ` Fam Zheng
0 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2017-04-19 12:14 UTC (permalink / raw)
To: Ashijeet Acharya
Cc: Kevin Wolf, qemu block, Stefan Hajnoczi, QEMU Developers,
Max Reitz, John Snow
On Mon, 04/10 18:34, Ashijeet Acharya wrote:
> On Sat, Apr 1, 2017 at 8:14 PM, Ashijeet Acharya
> <ashijeetacharya@gmail.com> wrote:
> > Move the existing vmdk_find_offset_in_cluster() function to the top of
> > the driver. Also, introduce a new helper function size_to_clusters()
> > which returns the number of clusters for a given size in bytes. Here,
> > we leave the last cluster as we need to perform COW for that one.
> >
> I will remove the trailing part of the commit message in v4 as there
> is no size_to_clusters() in this patch anymore; I forgot to update it!
With that updated, you can add my:
Reviewed-by: Fam Zheng <famz@redhat.com>
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Qemu-devel] [PATCH v3 2/6] vmdk: Rename get_whole_cluster() to vmdk_perform_cow()
2017-04-01 14:44 [Qemu-devel] [PATCH v3 0/6] Optimize VMDK I/O by allocating multiple clusters Ashijeet Acharya
2017-04-01 14:44 ` [Qemu-devel] [PATCH v3 1/6] vmdk: Move vmdk_find_offset_in_cluster() to the top Ashijeet Acharya
@ 2017-04-01 14:44 ` Ashijeet Acharya
2017-04-19 12:14 ` Fam Zheng
2017-04-01 14:44 ` [Qemu-devel] [PATCH v3 3/6] vmdk: New functions to assist allocating multiple clusters Ashijeet Acharya
` (3 subsequent siblings)
5 siblings, 1 reply; 21+ messages in thread
From: Ashijeet Acharya @ 2017-04-01 14:44 UTC (permalink / raw)
To: famz
Cc: kwolf, jsnow, mreitz, stefanha, qemu-devel, qemu-block,
Ashijeet Acharya
Rename the existing function get_whole_cluster() to vmdk_perform_cow()
as its sole purpose is to perform COW for the first and the last
allocated clusters if needed.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
---
block/vmdk.c | 23 ++++++++++++++---------
1 file changed, 14 insertions(+), 9 deletions(-)
diff --git a/block/vmdk.c b/block/vmdk.c
index 22be887..73ae786 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1028,8 +1028,8 @@ static void vmdk_refresh_limits(BlockDriverState *bs, Error **errp)
}
}
-/**
- * get_whole_cluster
+/*
+ * vmdk_perform_cow
*
* Copy backing file's cluster that covers @sector_num, otherwise write zero,
* to the cluster at @cluster_sector_num.
@@ -1037,13 +1037,18 @@ static void vmdk_refresh_limits(BlockDriverState *bs, Error **errp)
* If @skip_start_sector < @skip_end_sector, the relative range
* [@skip_start_sector, @skip_end_sector) is not copied or written, and leave
* it for call to write user data in the request.
+ *
+ * Returns:
+ * VMDK_OK: on success
+ *
+ * VMDK_ERROR: in error cases
*/
-static int get_whole_cluster(BlockDriverState *bs,
- VmdkExtent *extent,
- uint64_t cluster_offset,
- uint64_t offset,
- uint64_t skip_start_bytes,
- uint64_t skip_end_bytes)
+static int vmdk_perform_cow(BlockDriverState *bs,
+ VmdkExtent *extent,
+ uint64_t cluster_offset,
+ uint64_t offset,
+ uint64_t skip_start_bytes,
+ uint64_t skip_end_bytes)
{
int ret = VMDK_OK;
int64_t cluster_bytes;
@@ -1244,7 +1249,7 @@ static int get_cluster_offset(BlockDriverState *bs,
* This problem may occur because of insufficient space on host disk
* or inappropriate VM shutdown.
*/
- ret = get_whole_cluster(bs, extent, cluster_sector * BDRV_SECTOR_SIZE,
+ ret = vmdk_perform_cow(bs, extent, cluster_sector * BDRV_SECTOR_SIZE,
offset, skip_start_bytes, skip_end_bytes);
if (ret) {
return ret;
--
2.6.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH v3 2/6] vmdk: Rename get_whole_cluster() to vmdk_perform_cow()
2017-04-01 14:44 ` [Qemu-devel] [PATCH v3 2/6] vmdk: Rename get_whole_cluster() to vmdk_perform_cow() Ashijeet Acharya
@ 2017-04-19 12:14 ` Fam Zheng
0 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2017-04-19 12:14 UTC (permalink / raw)
To: Ashijeet Acharya; +Cc: kwolf, jsnow, mreitz, stefanha, qemu-devel, qemu-block
On Sat, 04/01 20:14, Ashijeet Acharya wrote:
> Rename the existing function get_whole_cluster() to vmdk_perform_cow()
> as its sole purpose is to perform COW for the first and the last
> allocated clusters if needed.
>
> Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
> ---
> block/vmdk.c | 23 ++++++++++++++---------
> 1 file changed, 14 insertions(+), 9 deletions(-)
>
> diff --git a/block/vmdk.c b/block/vmdk.c
> index 22be887..73ae786 100644
> --- a/block/vmdk.c
> +++ b/block/vmdk.c
> @@ -1028,8 +1028,8 @@ static void vmdk_refresh_limits(BlockDriverState *bs, Error **errp)
> }
> }
>
> -/**
> - * get_whole_cluster
> +/*
> + * vmdk_perform_cow
> *
> * Copy backing file's cluster that covers @sector_num, otherwise write zero,
> * to the cluster at @cluster_sector_num.
> @@ -1037,13 +1037,18 @@ static void vmdk_refresh_limits(BlockDriverState *bs, Error **errp)
> * If @skip_start_sector < @skip_end_sector, the relative range
> * [@skip_start_sector, @skip_end_sector) is not copied or written, and leave
> * it for call to write user data in the request.
> + *
> + * Returns:
> + * VMDK_OK: on success
> + *
> + * VMDK_ERROR: in error cases
> */
> -static int get_whole_cluster(BlockDriverState *bs,
> - VmdkExtent *extent,
> - uint64_t cluster_offset,
> - uint64_t offset,
> - uint64_t skip_start_bytes,
> - uint64_t skip_end_bytes)
> +static int vmdk_perform_cow(BlockDriverState *bs,
> + VmdkExtent *extent,
> + uint64_t cluster_offset,
> + uint64_t offset,
> + uint64_t skip_start_bytes,
> + uint64_t skip_end_bytes)
> {
> int ret = VMDK_OK;
> int64_t cluster_bytes;
> @@ -1244,7 +1249,7 @@ static int get_cluster_offset(BlockDriverState *bs,
> * This problem may occur because of insufficient space on host disk
> * or inappropriate VM shutdown.
> */
> - ret = get_whole_cluster(bs, extent, cluster_sector * BDRV_SECTOR_SIZE,
> + ret = vmdk_perform_cow(bs, extent, cluster_sector * BDRV_SECTOR_SIZE,
> offset, skip_start_bytes, skip_end_bytes);
> if (ret) {
> return ret;
> --
> 2.6.2
>
Reviewed-by: Fam Zheng <famz@redhat.com>
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Qemu-devel] [PATCH v3 3/6] vmdk: New functions to assist allocating multiple clusters
2017-04-01 14:44 [Qemu-devel] [PATCH v3 0/6] Optimize VMDK I/O by allocating multiple clusters Ashijeet Acharya
2017-04-01 14:44 ` [Qemu-devel] [PATCH v3 1/6] vmdk: Move vmdk_find_offset_in_cluster() to the top Ashijeet Acharya
2017-04-01 14:44 ` [Qemu-devel] [PATCH v3 2/6] vmdk: Rename get_whole_cluster() to vmdk_perform_cow() Ashijeet Acharya
@ 2017-04-01 14:44 ` Ashijeet Acharya
2017-04-19 12:56 ` Fam Zheng
2017-04-01 14:44 ` [Qemu-devel] [PATCH v3 4/6] vmdk: Rename get_cluster_offset() to vmdk_get_cluster_offset() Ashijeet Acharya
` (2 subsequent siblings)
5 siblings, 1 reply; 21+ messages in thread
From: Ashijeet Acharya @ 2017-04-01 14:44 UTC (permalink / raw)
To: famz
Cc: kwolf, jsnow, mreitz, stefanha, qemu-devel, qemu-block,
Ashijeet Acharya
Move the cluster tables loading code out of the existing
get_cluster_offset() function to avoid code duplication and implement it
in separate get_cluster_table() and vmdk_L2load() functions.
Introduce two new helper functions handle_alloc() and
vmdk_alloc_cluster_offset(). handle_alloc() helps to allocate multiple
clusters at once starting from a given offset on disk and performs COW
if necessary for first and last allocated clusters.
vmdk_alloc_cluster_offset() helps to return the offset of the first of
the many newly allocated clusters. Also, provide proper documentation
for both.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
---
block/vmdk.c | 337 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 308 insertions(+), 29 deletions(-)
diff --git a/block/vmdk.c b/block/vmdk.c
index 73ae786..e5a289d 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -136,6 +136,7 @@ typedef struct VmdkMetaData {
unsigned int l2_offset;
int valid;
uint32_t *l2_cache_entry;
+ uint32_t nb_clusters;
} VmdkMetaData;
typedef struct VmdkGrainMarker {
@@ -254,6 +255,14 @@ static inline uint64_t vmdk_find_offset_in_cluster(VmdkExtent *extent,
return extent_relative_offset % cluster_size;
}
+static inline uint64_t size_to_clusters(VmdkExtent *extent, uint64_t size)
+{
+ uint64_t cluster_size, round_off_size;
+ cluster_size = extent->cluster_sectors * BDRV_SECTOR_SIZE;
+ round_off_size = cluster_size - (size % cluster_size);
+ return DIV_ROUND_UP(size + round_off_size, BDRV_SECTOR_SIZE * 128) - 1;
+}
+
static uint32_t vmdk_read_cid(BlockDriverState *bs, int parent)
{
char *desc;
@@ -1028,6 +1037,133 @@ static void vmdk_refresh_limits(BlockDriverState *bs, Error **errp)
}
}
+static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data,
+ uint32_t offset)
+{
+ offset = cpu_to_le32(offset);
+ /* update L2 table */
+ if (bdrv_pwrite_sync(extent->file,
+ ((int64_t)m_data->l2_offset * 512)
+ + (m_data->l2_index * sizeof(offset)),
+ &offset, sizeof(offset)) < 0) {
+ return VMDK_ERROR;
+ }
+ /* update backup L2 table */
+ if (extent->l1_backup_table_offset != 0) {
+ m_data->l2_offset = extent->l1_backup_table[m_data->l1_index];
+ if (bdrv_pwrite_sync(extent->file,
+ ((int64_t)m_data->l2_offset * 512)
+ + (m_data->l2_index * sizeof(offset)),
+ &offset, sizeof(offset)) < 0) {
+ return VMDK_ERROR;
+ }
+ }
+ if (m_data->l2_cache_entry) {
+ *m_data->l2_cache_entry = offset;
+ }
+
+ return VMDK_OK;
+}
+
+/*
+ * vmdk_l2load
+ *
+ * Loads a new L2 table into memory. If the table is in the cache, the cache
+ * is used; otherwise the L2 table is loaded from the image file.
+ *
+ * Returns:
+ * VMDK_OK: on success
+ * VMDK_ERROR: in error cases
+ */
+static int vmdk_l2load(VmdkExtent *extent, uint64_t offset, int l2_offset,
+ uint32_t **new_l2_table, int *new_l2_index)
+{
+ int min_index, i, j;
+ uint32_t *l2_table;
+ uint32_t min_count;
+
+ for (i = 0; i < L2_CACHE_SIZE; i++) {
+ if (l2_offset == extent->l2_cache_offsets[i]) {
+ /* increment the hit count */
+ if (++extent->l2_cache_counts[i] == UINT32_MAX) {
+ for (j = 0; j < L2_CACHE_SIZE; j++) {
+ extent->l2_cache_counts[j] >>= 1;
+ }
+ }
+ l2_table = extent->l2_cache + (i * extent->l2_size);
+ goto found;
+ }
+ }
+ /* not found: load a new entry in the least used one */
+ min_index = 0;
+ min_count = UINT32_MAX;
+ for (i = 0; i < L2_CACHE_SIZE; i++) {
+ if (extent->l2_cache_counts[i] < min_count) {
+ min_count = extent->l2_cache_counts[i];
+ min_index = i;
+ }
+ }
+ l2_table = extent->l2_cache + (min_index * extent->l2_size);
+ if (bdrv_pread(extent->file,
+ (int64_t)l2_offset * 512,
+ l2_table,
+ extent->l2_size * sizeof(uint32_t)
+ ) != extent->l2_size * sizeof(uint32_t)) {
+ return VMDK_ERROR;
+ }
+
+ extent->l2_cache_offsets[min_index] = l2_offset;
+ extent->l2_cache_counts[min_index] = 1;
+found:
+ *new_l2_index = ((offset >> 9) / extent->cluster_sectors) % extent->l2_size;
+ *new_l2_table = l2_table;
+
+ return VMDK_OK;
+}
+
+/*
+ * get_cluster_table
+ *
+ * for a given offset, load (and allocate if needed) the l2 table.
+ *
+ * Returns:
+ * VMDK_OK: on success
+ *
+ * VMDK_UNALLOC: if cluster is not mapped
+ *
+ * VMDK_ERROR: in error cases
+ */
+static int get_cluster_table(VmdkExtent *extent, uint64_t offset,
+ int *new_l1_index, int *new_l2_offset,
+ int *new_l2_index, uint32_t **new_l2_table)
+{
+ int l1_index, l2_offset, l2_index;
+ uint32_t *l2_table;
+ int ret;
+
+ offset -= (extent->end_sector - extent->sectors) * SECTOR_SIZE;
+ l1_index = (offset >> 9) / extent->l1_entry_sectors;
+ if (l1_index >= extent->l1_size) {
+ return VMDK_ERROR;
+ }
+ l2_offset = extent->l1_table[l1_index];
+ if (!l2_offset) {
+ return VMDK_UNALLOC;
+ }
+
+ ret = vmdk_l2load(extent, offset, l2_offset, &l2_table, &l2_index);
+ if (ret < 0) {
+ return ret;
+ }
+
+ *new_l1_index = l1_index;
+ *new_l2_offset = l2_offset;
+ *new_l2_index = l2_index;
+ *new_l2_table = l2_table;
+
+ return VMDK_OK;
+}
+
/*
* vmdk_perform_cow
*
@@ -1115,29 +1251,168 @@ exit:
return ret;
}
-static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data,
- uint32_t offset)
+/*
+ * handle_alloc
+ *
+ * Allocates new clusters for an area that either is yet unallocated or needs a
+ * copy on write. If *cluster_offset is non_zero, clusters are only allocated if
+ * the new allocation can match the specified host offset.
+ *
+ * Returns:
+ * VMDK_OK: if new clusters were allocated, *bytes may be decreased if
+ * the new allocation doesn't cover all of the requested area.
+ * *cluster_offset is updated to contain the offset of the
+ * first newly allocated cluster.
+ *
+ * VMDK_UNALLOC: if no clusters could be allocated. *cluster_offset is left
+ * unchanged.
+ *
+ * VMDK_ERROR: in error cases
+ */
+static int handle_alloc(BlockDriverState *bs, VmdkExtent *extent,
+ uint64_t offset, uint64_t *cluster_offset,
+ int64_t *bytes, VmdkMetaData *m_data,
+ bool allocate, uint32_t *total_alloc_clusters)
{
- offset = cpu_to_le32(offset);
- /* update L2 table */
- if (bdrv_pwrite_sync(extent->file,
- ((int64_t)m_data->l2_offset * 512)
- + (m_data->l2_index * sizeof(offset)),
- &offset, sizeof(offset)) < 0) {
- return VMDK_ERROR;
+ int l1_index, l2_offset, l2_index;
+ uint32_t *l2_table;
+ uint32_t cluster_sector;
+ uint32_t nb_clusters;
+ bool zeroed = false;
+ uint64_t skip_start_bytes, skip_end_bytes;
+ int ret;
+
+ ret = get_cluster_table(extent, offset, &l1_index, &l2_offset,
+ &l2_index, &l2_table);
+ if (ret < 0) {
+ return ret;
}
- /* update backup L2 table */
- if (extent->l1_backup_table_offset != 0) {
- m_data->l2_offset = extent->l1_backup_table[m_data->l1_index];
- if (bdrv_pwrite_sync(extent->file,
- ((int64_t)m_data->l2_offset * 512)
- + (m_data->l2_index * sizeof(offset)),
- &offset, sizeof(offset)) < 0) {
- return VMDK_ERROR;
+
+ cluster_sector = le32_to_cpu(l2_table[l2_index]);
+
+ skip_start_bytes = vmdk_find_offset_in_cluster(extent, offset);
+ /* Calculate the number of clusters to look for. Here it will return one
+ * cluster less than the actual value calculated as we may need to perfrom
+ * COW for the last one. */
+ nb_clusters = size_to_clusters(extent, skip_start_bytes + *bytes);
+
+ nb_clusters = MIN(nb_clusters, extent->l2_size - l2_index);
+ assert(nb_clusters <= INT_MAX);
+
+ /* update bytes according to final nb_clusters value */
+ if (nb_clusters != 0) {
+ *bytes = ((nb_clusters * extent->cluster_sectors) << 9)
+ - skip_start_bytes;
+ } else {
+ nb_clusters = 1;
+ }
+ *total_alloc_clusters += nb_clusters;
+ skip_end_bytes = skip_start_bytes + MIN(*bytes,
+ extent->cluster_sectors * BDRV_SECTOR_SIZE
+ - skip_start_bytes);
+
+ if (extent->has_zero_grain && cluster_sector == VMDK_GTE_ZEROED) {
+ zeroed = true;
+ }
+
+ if (!cluster_sector || zeroed) {
+ if (!allocate) {
+ return zeroed ? VMDK_ZEROED : VMDK_UNALLOC;
+ }
+
+ cluster_sector = extent->next_cluster_sector;
+ extent->next_cluster_sector += extent->cluster_sectors
+ * nb_clusters;
+
+ ret = vmdk_perform_cow(bs, extent, cluster_sector * BDRV_SECTOR_SIZE,
+ offset, skip_start_bytes,
+ skip_end_bytes);
+ if (ret < 0) {
+ return ret;
+ }
+ if (m_data) {
+ m_data->valid = 1;
+ m_data->l1_index = l1_index;
+ m_data->l2_index = l2_index;
+ m_data->l2_offset = l2_offset;
+ m_data->l2_cache_entry = &l2_table[l2_index];
+ m_data->nb_clusters = nb_clusters;
}
}
- if (m_data->l2_cache_entry) {
- *m_data->l2_cache_entry = offset;
+ *cluster_offset = cluster_sector << BDRV_SECTOR_BITS;
+ return VMDK_OK;
+}
+
+/*
+ * vmdk_alloc_cluster_offset
+ *
+ * For a given offset on the virtual disk, find the cluster offset in vmdk
+ * file. If the offset is not found, allocate a new cluster.
+ *
+ * If the cluster is newly allocated, m_data->nb_clusters is set to the number
+ * of contiguous clusters that have been allocated. In this case, the other
+ * fields of m_data are valid and contain information about the first allocated
+ * cluster.
+ *
+ * Returns:
+ *
+ * VMDK_OK: on success and @cluster_offset was set
+ *
+ * VMDK_UNALLOC: if no clusters were allocated and @cluster_offset is
+ * set to zero
+ *
+ * VMDK_ERROR: in error cases
+ */
+static int vmdk_alloc_cluster_offset(BlockDriverState *bs,
+ VmdkExtent *extent,
+ VmdkMetaData *m_data, uint64_t offset,
+ bool allocate, uint64_t *cluster_offset,
+ int64_t bytes,
+ uint32_t *total_alloc_clusters)
+{
+ uint64_t start, remaining;
+ uint64_t new_cluster_offset;
+ int64_t n_bytes;
+ int ret;
+
+ if (extent->flat) {
+ *cluster_offset = extent->flat_start_offset;
+ return VMDK_OK;
+ }
+
+ start = offset;
+ remaining = bytes;
+ new_cluster_offset = 0;
+ *cluster_offset = 0;
+ n_bytes = 0;
+ if (m_data) {
+ m_data->valid = 0;
+ }
+
+ /* due to L2 table margins all bytes may not get allocated at once */
+ while (true) {
+
+ if (!*cluster_offset) {
+ *cluster_offset = new_cluster_offset;
+ }
+
+ start += n_bytes;
+ remaining -= n_bytes;
+ new_cluster_offset += n_bytes;
+
+ if (remaining == 0) {
+ break;
+ }
+
+ n_bytes = remaining;
+
+ ret = handle_alloc(bs, extent, start, &new_cluster_offset, &n_bytes,
+ m_data, allocate, total_alloc_clusters);
+
+ if (ret < 0) {
+ return ret;
+
+ }
}
return VMDK_OK;
@@ -1567,6 +1842,7 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
uint64_t cluster_offset;
uint64_t bytes_done = 0;
VmdkMetaData m_data;
+ uint32_t total_alloc_clusters = 0;
if (DIV_ROUND_UP(offset, BDRV_SECTOR_SIZE) > bs->total_sectors) {
error_report("Wrong offset: offset=0x%" PRIx64
@@ -1584,10 +1860,10 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
n_bytes = MIN(bytes, extent->cluster_sectors * BDRV_SECTOR_SIZE
- offset_in_cluster);
- ret = get_cluster_offset(bs, extent, &m_data, offset,
- !(extent->compressed || zeroed),
- &cluster_offset, offset_in_cluster,
- offset_in_cluster + n_bytes);
+ ret = vmdk_alloc_cluster_offset(bs, extent, &m_data, offset,
+ !(extent->compressed || zeroed),
+ &cluster_offset, n_bytes,
+ &total_alloc_clusters);
if (extent->compressed) {
if (ret == VMDK_OK) {
/* Refuse write to allocated cluster for streamOptimized */
@@ -1596,19 +1872,22 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
return -EIO;
} else {
/* allocate */
- ret = get_cluster_offset(bs, extent, &m_data, offset,
- true, &cluster_offset, 0, 0);
+ ret = vmdk_alloc_cluster_offset(bs, extent, &m_data, offset,
+ true, &cluster_offset, n_bytes,
+ &total_alloc_clusters);
}
}
if (ret == VMDK_ERROR) {
return -EINVAL;
}
+
if (zeroed) {
/* Do zeroed write, buf is ignored */
- if (extent->has_zero_grain &&
- offset_in_cluster == 0 &&
- n_bytes >= extent->cluster_sectors * BDRV_SECTOR_SIZE) {
- n_bytes = extent->cluster_sectors * BDRV_SECTOR_SIZE;
+ if (extent->has_zero_grain && offset_in_cluster == 0 &&
+ n_bytes >= extent->cluster_sectors * BDRV_SECTOR_SIZE *
+ total_alloc_clusters) {
+ n_bytes = extent->cluster_sectors * BDRV_SECTOR_SIZE *
+ total_alloc_clusters;
if (!zero_dry_run) {
/* update L2 tables */
if (vmdk_L2update(extent, &m_data, VMDK_GTE_ZEROED)
--
2.6.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH v3 3/6] vmdk: New functions to assist allocating multiple clusters
2017-04-01 14:44 ` [Qemu-devel] [PATCH v3 3/6] vmdk: New functions to assist allocating multiple clusters Ashijeet Acharya
@ 2017-04-19 12:56 ` Fam Zheng
2017-04-19 15:13 ` Ashijeet Acharya
0 siblings, 1 reply; 21+ messages in thread
From: Fam Zheng @ 2017-04-19 12:56 UTC (permalink / raw)
To: Ashijeet Acharya; +Cc: kwolf, jsnow, mreitz, stefanha, qemu-devel, qemu-block
On Sat, 04/01 20:14, Ashijeet Acharya wrote:
> Move the cluster tables loading code out of the existing
> get_cluster_offset() function to avoid code duplication and implement it
> in separate get_cluster_table() and vmdk_L2load() functions.
>
> Introduce two new helper functions handle_alloc() and
> vmdk_alloc_cluster_offset(). handle_alloc() helps to allocate multiple
> clusters at once starting from a given offset on disk and performs COW
> if necessary for first and last allocated clusters.
> vmdk_alloc_cluster_offset() helps to return the offset of the first of
> the many newly allocated clusters. Also, provide proper documentation
> for both.
>
> Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
> ---
> block/vmdk.c | 337 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
> 1 file changed, 308 insertions(+), 29 deletions(-)
>
> diff --git a/block/vmdk.c b/block/vmdk.c
> index 73ae786..e5a289d 100644
> --- a/block/vmdk.c
> +++ b/block/vmdk.c
> @@ -136,6 +136,7 @@ typedef struct VmdkMetaData {
> unsigned int l2_offset;
> int valid;
> uint32_t *l2_cache_entry;
> + uint32_t nb_clusters;
> } VmdkMetaData;
>
> typedef struct VmdkGrainMarker {
> @@ -254,6 +255,14 @@ static inline uint64_t vmdk_find_offset_in_cluster(VmdkExtent *extent,
> return extent_relative_offset % cluster_size;
> }
>
> +static inline uint64_t size_to_clusters(VmdkExtent *extent, uint64_t size)
> +{
> + uint64_t cluster_size, round_off_size;
> + cluster_size = extent->cluster_sectors * BDRV_SECTOR_SIZE;
> + round_off_size = cluster_size - (size % cluster_size);
> + return DIV_ROUND_UP(size + round_off_size, BDRV_SECTOR_SIZE * 128) - 1;
What is (BDRV_SECTOR_SIZE * 128)? Do you mean extent->cluster_size? And the
function doesn't make sense to me.
Just un-inline this to
DIV_ROUND_UP(size,
extent->cluster_sectors << BDRV_SECTOR_BITS) - 1
in the calling site and be done with it.
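For illustration, applied at the call site in handle_alloc() the suggestion
would read roughly like this (a minimal sketch reusing the names from the
patch above, not code from the series):

    /* whole clusters covered by the request, minus the final one,
     * which may still need COW */
    nb_clusters = DIV_ROUND_UP(skip_start_bytes + *bytes,
                               extent->cluster_sectors << BDRV_SECTOR_BITS) - 1;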
> +}
> +
> static uint32_t vmdk_read_cid(BlockDriverState *bs, int parent)
> {
> char *desc;
> @@ -1028,6 +1037,133 @@ static void vmdk_refresh_limits(BlockDriverState *bs, Error **errp)
> }
> }
>
> +static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data,
> + uint32_t offset)
> +{
> + offset = cpu_to_le32(offset);
> + /* update L2 table */
> + if (bdrv_pwrite_sync(extent->file,
> + ((int64_t)m_data->l2_offset * 512)
> + + (m_data->l2_index * sizeof(offset)),
> + &offset, sizeof(offset)) < 0) {
> + return VMDK_ERROR;
> + }
> + /* update backup L2 table */
> + if (extent->l1_backup_table_offset != 0) {
> + m_data->l2_offset = extent->l1_backup_table[m_data->l1_index];
> + if (bdrv_pwrite_sync(extent->file,
> + ((int64_t)m_data->l2_offset * 512)
> + + (m_data->l2_index * sizeof(offset)),
> + &offset, sizeof(offset)) < 0) {
> + return VMDK_ERROR;
> + }
> + }
> + if (m_data->l2_cache_entry) {
> + *m_data->l2_cache_entry = offset;
> + }
> +
> + return VMDK_OK;
> +}
> +
> +/*
> + * vmdk_l2load
> + *
> + * Loads a new L2 table into memory. If the table is in the cache, the cache
Not a native speaker, but s/Loads/Load/ feels more natural and consistent with
other comments.
> + * is used; otherwise the L2 table is loaded from the image file.
> + *
> + * Returns:
> + * VMDK_OK: on success
> + * VMDK_ERROR: in error cases
> + */
> +static int vmdk_l2load(VmdkExtent *extent, uint64_t offset, int l2_offset,
> + uint32_t **new_l2_table, int *new_l2_index)
> +{
> + int min_index, i, j;
> + uint32_t *l2_table;
> + uint32_t min_count;
> +
> + for (i = 0; i < L2_CACHE_SIZE; i++) {
> + if (l2_offset == extent->l2_cache_offsets[i]) {
> + /* increment the hit count */
> + if (++extent->l2_cache_counts[i] == UINT32_MAX) {
> + for (j = 0; j < L2_CACHE_SIZE; j++) {
> + extent->l2_cache_counts[j] >>= 1;
> + }
> + }
> + l2_table = extent->l2_cache + (i * extent->l2_size);
> + goto found;
> + }
> + }
> + /* not found: load a new entry in the least used one */
> + min_index = 0;
> + min_count = UINT32_MAX;
> + for (i = 0; i < L2_CACHE_SIZE; i++) {
> + if (extent->l2_cache_counts[i] < min_count) {
> + min_count = extent->l2_cache_counts[i];
> + min_index = i;
> + }
> + }
> + l2_table = extent->l2_cache + (min_index * extent->l2_size);
> + if (bdrv_pread(extent->file,
> + (int64_t)l2_offset * 512,
> + l2_table,
> + extent->l2_size * sizeof(uint32_t)
> + ) != extent->l2_size * sizeof(uint32_t)) {
> + return VMDK_ERROR;
> + }
> +
> + extent->l2_cache_offsets[min_index] = l2_offset;
> + extent->l2_cache_counts[min_index] = 1;
> +found:
> + *new_l2_index = ((offset >> 9) / extent->cluster_sectors) % extent->l2_size;
> + *new_l2_table = l2_table;
> +
> + return VMDK_OK;
> +}
> +
> +/*
> + * get_cluster_table
> + *
> + * for a given offset, load (and allocate if needed) the l2 table.
> + *
> + * Returns:
> + * VMDK_OK: on success
> + *
> + * VMDK_UNALLOC: if cluster is not mapped
> + *
> + * VMDK_ERROR: in error cases
> + */
> +static int get_cluster_table(VmdkExtent *extent, uint64_t offset,
> + int *new_l1_index, int *new_l2_offset,
> + int *new_l2_index, uint32_t **new_l2_table)
> +{
> + int l1_index, l2_offset, l2_index;
> + uint32_t *l2_table;
> + int ret;
> +
> + offset -= (extent->end_sector - extent->sectors) * SECTOR_SIZE;
> + l1_index = (offset >> 9) / extent->l1_entry_sectors;
> + if (l1_index >= extent->l1_size) {
> + return VMDK_ERROR;
> + }
> + l2_offset = extent->l1_table[l1_index];
> + if (!l2_offset) {
> + return VMDK_UNALLOC;
> + }
> +
> + ret = vmdk_l2load(extent, offset, l2_offset, &l2_table, &l2_index);
> + if (ret < 0) {
> + return ret;
> + }
> +
> + *new_l1_index = l1_index;
> + *new_l2_offset = l2_offset;
> + *new_l2_index = l2_index;
> + *new_l2_table = l2_table;
> +
> + return VMDK_OK;
> +}
> +
Can you move this hunk into patch 4 and put it before this patch? It will make
reviewing a bit easier. (Yes, this patch is already big.)
> /*
> * vmdk_perform_cow
> *
> @@ -1115,29 +1251,168 @@ exit:
> return ret;
> }
>
> -static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data,
> - uint32_t offset)
> +/*
> + * handle_alloc
> + *
> + * Allocates new clusters for an area that either is yet unallocated or needs a
Similar to vmdk_l2load, s/Allocates/Allocate/
> + * copy on write. If *cluster_offset is non_zero, clusters are only allocated if
> + * the new allocation can match the specified host offset.
> + *
> + * Returns:
> + * VMDK_OK: if new clusters were allocated, *bytes may be decreased if
> + * the new allocation doesn't cover all of the requested area.
> + * *cluster_offset is updated to contain the offset of the
> + * first newly allocated cluster.
> + *
> + * VMDK_UNALLOC: if no clusters could be allocated. *cluster_offset is left
> + * unchanged.
> + *
> + * VMDK_ERROR: in error cases
> + */
> +static int handle_alloc(BlockDriverState *bs, VmdkExtent *extent,
> + uint64_t offset, uint64_t *cluster_offset,
> + int64_t *bytes, VmdkMetaData *m_data,
> + bool allocate, uint32_t *total_alloc_clusters)
> {
> - offset = cpu_to_le32(offset);
> - /* update L2 table */
> - if (bdrv_pwrite_sync(extent->file,
> - ((int64_t)m_data->l2_offset * 512)
> - + (m_data->l2_index * sizeof(offset)),
> - &offset, sizeof(offset)) < 0) {
> - return VMDK_ERROR;
> + int l1_index, l2_offset, l2_index;
> + uint32_t *l2_table;
> + uint32_t cluster_sector;
> + uint32_t nb_clusters;
> + bool zeroed = false;
> + uint64_t skip_start_bytes, skip_end_bytes;
> + int ret;
> +
> + ret = get_cluster_table(extent, offset, &l1_index, &l2_offset,
> + &l2_index, &l2_table);
> + if (ret < 0) {
> + return ret;
> }
> - /* update backup L2 table */
> - if (extent->l1_backup_table_offset != 0) {
> - m_data->l2_offset = extent->l1_backup_table[m_data->l1_index];
> - if (bdrv_pwrite_sync(extent->file,
> - ((int64_t)m_data->l2_offset * 512)
> - + (m_data->l2_index * sizeof(offset)),
> - &offset, sizeof(offset)) < 0) {
> - return VMDK_ERROR;
> +
> + cluster_sector = le32_to_cpu(l2_table[l2_index]);
> +
> + skip_start_bytes = vmdk_find_offset_in_cluster(extent, offset);
> + /* Calculate the number of clusters to look for. Here it will return one
> + * cluster less than the actual value calculated as we may need to perform
> + * COW for the last one. */
> + nb_clusters = size_to_clusters(extent, skip_start_bytes + *bytes);
> +
> + nb_clusters = MIN(nb_clusters, extent->l2_size - l2_index);
> + assert(nb_clusters <= INT_MAX);
> +
> + /* update bytes according to final nb_clusters value */
> + if (nb_clusters != 0) {
> + *bytes = ((nb_clusters * extent->cluster_sectors) << 9)
> + - skip_start_bytes;
> + } else {
> + nb_clusters = 1;
> + }
> + *total_alloc_clusters += nb_clusters;
> + skip_end_bytes = skip_start_bytes + MIN(*bytes,
> + extent->cluster_sectors * BDRV_SECTOR_SIZE
> + - skip_start_bytes);
> +
> + if (extent->has_zero_grain && cluster_sector == VMDK_GTE_ZEROED) {
> + zeroed = true;
> + }
> +
> + if (!cluster_sector || zeroed) {
> + if (!allocate) {
> + return zeroed ? VMDK_ZEROED : VMDK_UNALLOC;
> + }
> +
> + cluster_sector = extent->next_cluster_sector;
> + extent->next_cluster_sector += extent->cluster_sectors
> + * nb_clusters;
> +
> + ret = vmdk_perform_cow(bs, extent, cluster_sector * BDRV_SECTOR_SIZE,
> + offset, skip_start_bytes,
> + skip_end_bytes);
> + if (ret < 0) {
> + return ret;
> + }
> + if (m_data) {
> + m_data->valid = 1;
> + m_data->l1_index = l1_index;
> + m_data->l2_index = l2_index;
> + m_data->l2_offset = l2_offset;
> + m_data->l2_cache_entry = &l2_table[l2_index];
> + m_data->nb_clusters = nb_clusters;
> }
> }
> - if (m_data->l2_cache_entry) {
> - *m_data->l2_cache_entry = offset;
> + *cluster_offset = cluster_sector << BDRV_SECTOR_BITS;
> + return VMDK_OK;
> +}
> +
> +/*
> + * vmdk_alloc_cluster_offset
Maybe just name it "vmdk_alloc_clusters", which sounds better to me? Because the
clusters are what we allocate here, it's rather
"vmdk_alloc_clusters_and_get_offset" but we probably don't want it that long.
> + *
> + * For a given offset on the virtual disk, find the cluster offset in vmdk
> + * file. If the offset is not found, allocate a new cluster.
> + *
> + * If the cluster is newly allocated, m_data->nb_clusters is set to the number
> + * of contiguous clusters that have been allocated. In this case, the other
> + * fields of m_data are valid and contain information about the first allocated
> + * cluster.
> + *
> + * Returns:
> + *
> + * VMDK_OK: on success and @cluster_offset was set
> + *
> + * VMDK_UNALLOC: if no clusters were allocated and @cluster_offset is
> + * set to zero
> + *
> + * VMDK_ERROR: in error cases
Thank you for adding the function documentations!
> + */
> +static int vmdk_alloc_cluster_offset(BlockDriverState *bs,
> + VmdkExtent *extent,
> + VmdkMetaData *m_data, uint64_t offset,
> + bool allocate, uint64_t *cluster_offset,
> + int64_t bytes,
> + uint32_t *total_alloc_clusters)
> +{
> + uint64_t start, remaining;
> + uint64_t new_cluster_offset;
> + int64_t n_bytes;
> + int ret;
> +
> + if (extent->flat) {
> + *cluster_offset = extent->flat_start_offset;
> + return VMDK_OK;
> + }
> +
> + start = offset;
> + remaining = bytes;
> + new_cluster_offset = 0;
> + *cluster_offset = 0;
> + n_bytes = 0;
> + if (m_data) {
> + m_data->valid = 0;
> + }
> +
> + /* due to L2 table margins all bytes may not get allocated at once */
> + while (true) {
> +
> + if (!*cluster_offset) {
> + *cluster_offset = new_cluster_offset;
> + }
> +
> + start += n_bytes;
> + remaining -= n_bytes;
Here, in the first iteration, remaining == bytes and n_bytes == 0.
> + new_cluster_offset += n_bytes;
> +
> + if (remaining == 0) {
> + break;
> + }
> +
> + n_bytes = remaining;
Then n_bytes becomes bytes;
In the second iteration, remaining is always 0 because of "remaining -=
n_bytes". What's the point of the while loop?
> +
> + ret = handle_alloc(bs, extent, start, &new_cluster_offset, &n_bytes,
> + m_data, allocate, total_alloc_clusters);
> +
> + if (ret < 0) {
> + return ret;
> +
> + }
> }
>
> return VMDK_OK;
> @@ -1567,6 +1842,7 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
> uint64_t cluster_offset;
> uint64_t bytes_done = 0;
> VmdkMetaData m_data;
> + uint32_t total_alloc_clusters = 0;
>
> if (DIV_ROUND_UP(offset, BDRV_SECTOR_SIZE) > bs->total_sectors) {
> error_report("Wrong offset: offset=0x%" PRIx64
> @@ -1584,10 +1860,10 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
> n_bytes = MIN(bytes, extent->cluster_sectors * BDRV_SECTOR_SIZE
> - offset_in_cluster);
>
> - ret = get_cluster_offset(bs, extent, &m_data, offset,
> - !(extent->compressed || zeroed),
> - &cluster_offset, offset_in_cluster,
> - offset_in_cluster + n_bytes);
> + ret = vmdk_alloc_cluster_offset(bs, extent, &m_data, offset,
> + !(extent->compressed || zeroed),
> + &cluster_offset, n_bytes,
> + &total_alloc_clusters);
> if (extent->compressed) {
> if (ret == VMDK_OK) {
> /* Refuse write to allocated cluster for streamOptimized */
> @@ -1596,19 +1872,22 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
> return -EIO;
> } else {
> /* allocate */
> - ret = get_cluster_offset(bs, extent, &m_data, offset,
> - true, &cluster_offset, 0, 0);
> + ret = vmdk_alloc_cluster_offset(bs, extent, &m_data, offset,
> + true, &cluster_offset, n_bytes,
> + &total_alloc_clusters);
Parameter list is no longer aligned now.
> }
> }
> if (ret == VMDK_ERROR) {
> return -EINVAL;
> }
> +
> if (zeroed) {
> /* Do zeroed write, buf is ignored */
> - if (extent->has_zero_grain &&
> - offset_in_cluster == 0 &&
> - n_bytes >= extent->cluster_sectors * BDRV_SECTOR_SIZE) {
> - n_bytes = extent->cluster_sectors * BDRV_SECTOR_SIZE;
> + if (extent->has_zero_grain && offset_in_cluster == 0 &&
> + n_bytes >= extent->cluster_sectors * BDRV_SECTOR_SIZE *
> + total_alloc_clusters) {
> + n_bytes = extent->cluster_sectors * BDRV_SECTOR_SIZE *
> + total_alloc_clusters;
> if (!zero_dry_run) {
> /* update L2 tables */
> if (vmdk_L2update(extent, &m_data, VMDK_GTE_ZEROED)
> --
> 2.6.2
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH v3 3/6] vmdk: New functions to assist allocating multiple clusters
2017-04-19 12:56 ` Fam Zheng
@ 2017-04-19 15:13 ` Ashijeet Acharya
2017-04-20 0:47 ` Fam Zheng
0 siblings, 1 reply; 21+ messages in thread
From: Ashijeet Acharya @ 2017-04-19 15:13 UTC (permalink / raw)
To: Fam Zheng; +Cc: jsnow, kwolf, mreitz, qemu-block, qemu-devel, stefanha
On Wed, Apr 19, 2017 at 18:26 Fam Zheng <famz@redhat.com> wrote:
> On Sat, 04/01 20:14, Ashijeet Acharya wrote:
> > Move the cluster tables loading code out of the existing
> > get_cluster_offset() function to avoid code duplication and implement it
> > in separate get_cluster_table() and vmdk_L2load() functions.
> >
> > Introduce two new helper functions handle_alloc() and
> > vmdk_alloc_cluster_offset(). handle_alloc() helps to allocate multiple
> > clusters at once starting from a given offset on disk and performs COW
> > if necessary for first and last allocated clusters.
> > vmdk_alloc_cluster_offset() helps to return the offset of the first of
> > the many newly allocated clusters. Also, provide proper documentation
> > for both.
> >
> > Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
> > ---
> > block/vmdk.c | 337
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
> > 1 file changed, 308 insertions(+), 29 deletions(-)
> >
> > diff --git a/block/vmdk.c b/block/vmdk.c
> > index 73ae786..e5a289d 100644
> > --- a/block/vmdk.c
> > +++ b/block/vmdk.c
> > @@ -136,6 +136,7 @@ typedef struct VmdkMetaData {
> > unsigned int l2_offset;
> > int valid;
> > uint32_t *l2_cache_entry;
> > + uint32_t nb_clusters;
> > } VmdkMetaData;
> >
> > typedef struct VmdkGrainMarker {
> > @@ -254,6 +255,14 @@ static inline uint64_t
> vmdk_find_offset_in_cluster(VmdkExtent *extent,
> > return extent_relative_offset % cluster_size;
> > }
> >
> > +static inline uint64_t size_to_clusters(VmdkExtent *extent, uint64_t
> size)
> > +{
> > + uint64_t cluster_size, round_off_size;
> > + cluster_size = extent->cluster_sectors * BDRV_SECTOR_SIZE;
> > + round_off_size = cluster_size - (size % cluster_size);
> > + return DIV_ROUND_UP(size + round_off_size, BDRV_SECTOR_SIZE * 128)
> - 1;
>
> What is (BDRV_SECTOR_SIZE * 128)? Do you mean extent->cluster_size? And
> the
> function doesn't make sense to me.
>
> Just un-inline this to
>
> DIV_ROUND_UP(size,
> extent->cluster_sectors << BDRV_SECTOR_BITS) - 1
>
> in the calling site and be done with it.
>
> > +}
> > +
> > static uint32_t vmdk_read_cid(BlockDriverState *bs, int parent)
> > {
> > char *desc;
> > @@ -1028,6 +1037,133 @@ static void vmdk_refresh_limits(BlockDriverState
> *bs, Error **errp)
> > }
> > }
> >
> > +static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data,
> > + uint32_t offset)
> > +{
> > + offset = cpu_to_le32(offset);
> > + /* update L2 table */
> > + if (bdrv_pwrite_sync(extent->file,
> > + ((int64_t)m_data->l2_offset * 512)
> > + + (m_data->l2_index * sizeof(offset)),
> > + &offset, sizeof(offset)) < 0) {
> > + return VMDK_ERROR;
> > + }
> > + /* update backup L2 table */
> > + if (extent->l1_backup_table_offset != 0) {
> > + m_data->l2_offset = extent->l1_backup_table[m_data->l1_index];
> > + if (bdrv_pwrite_sync(extent->file,
> > + ((int64_t)m_data->l2_offset * 512)
> > + + (m_data->l2_index * sizeof(offset)),
> > + &offset, sizeof(offset)) < 0) {
> > + return VMDK_ERROR;
> > + }
> > + }
> > + if (m_data->l2_cache_entry) {
> > + *m_data->l2_cache_entry = offset;
> > + }
> > +
> > + return VMDK_OK;
> > +}
> > +
> > +/*
> > + * vmdk_l2load
> > + *
> > + * Loads a new L2 table into memory. If the table is in the cache, the
> cache
>
> Not a native speaker, but s/Loads/Load/ feels more natural and consistent
> with
> other comments.
>
> > + * is used; otherwise the L2 table is loaded from the image file.
> > + *
> > + * Returns:
> > + * VMDK_OK: on success
> > + * VMDK_ERROR: in error cases
> > + */
> > +static int vmdk_l2load(VmdkExtent *extent, uint64_t offset, int
> l2_offset,
> > + uint32_t **new_l2_table, int *new_l2_index)
> > +{
> > + int min_index, i, j;
> > + uint32_t *l2_table;
> > + uint32_t min_count;
> > +
> > + for (i = 0; i < L2_CACHE_SIZE; i++) {
> > + if (l2_offset == extent->l2_cache_offsets[i]) {
> > + /* increment the hit count */
> > + if (++extent->l2_cache_counts[i] == UINT32_MAX) {
> > + for (j = 0; j < L2_CACHE_SIZE; j++) {
> > + extent->l2_cache_counts[j] >>= 1;
> > + }
> > + }
> > + l2_table = extent->l2_cache + (i * extent->l2_size);
> > + goto found;
> > + }
> > + }
> > + /* not found: load a new entry in the least used one */
> > + min_index = 0;
> > + min_count = UINT32_MAX;
> > + for (i = 0; i < L2_CACHE_SIZE; i++) {
> > + if (extent->l2_cache_counts[i] < min_count) {
> > + min_count = extent->l2_cache_counts[i];
> > + min_index = i;
> > + }
> > + }
> > + l2_table = extent->l2_cache + (min_index * extent->l2_size);
> > + if (bdrv_pread(extent->file,
> > + (int64_t)l2_offset * 512,
> > + l2_table,
> > + extent->l2_size * sizeof(uint32_t)
> > + ) != extent->l2_size * sizeof(uint32_t)) {
> > + return VMDK_ERROR;
> > + }
> > +
> > + extent->l2_cache_offsets[min_index] = l2_offset;
> > + extent->l2_cache_counts[min_index] = 1;
> > +found:
> > + *new_l2_index = ((offset >> 9) / extent->cluster_sectors) %
> extent->l2_size;
> > + *new_l2_table = l2_table;
> > +
> > + return VMDK_OK;
> > +}
> > +
> > +/*
> > + * get_cluster_table
> > + *
> > + * for a given offset, load (and allocate if needed) the l2 table.
> > + *
> > + * Returns:
> > + * VMDK_OK: on success
> > + *
> > + * VMDK_UNALLOC: if cluster is not mapped
> > + *
> > + * VMDK_ERROR: in error cases
> > + */
> > +static int get_cluster_table(VmdkExtent *extent, uint64_t offset,
> > + int *new_l1_index, int *new_l2_offset,
> > + int *new_l2_index, uint32_t **new_l2_table)
> > +{
> > + int l1_index, l2_offset, l2_index;
> > + uint32_t *l2_table;
> > + int ret;
> > +
> > + offset -= (extent->end_sector - extent->sectors) * SECTOR_SIZE;
> > + l1_index = (offset >> 9) / extent->l1_entry_sectors;
> > + if (l1_index >= extent->l1_size) {
> > + return VMDK_ERROR;
> > + }
> > + l2_offset = extent->l1_table[l1_index];
> > + if (!l2_offset) {
> > + return VMDK_UNALLOC;
> > + }
> > +
> > + ret = vmdk_l2load(extent, offset, l2_offset, &l2_table, &l2_index);
> > + if (ret < 0) {
> > + return ret;
> > + }
> > +
> > + *new_l1_index = l1_index;
> > + *new_l2_offset = l2_offset;
> > + *new_l2_index = l2_index;
> > + *new_l2_table = l2_table;
> > +
> > + return VMDK_OK;
> > +}
> > +
>
> Can you move this hunk into patch 4 and put it before this patch? It will
> make
> reviewing a bit easier. (Yes, this patch is already big.)
>
Right, I will change it as you say. I know it's big and I didn't like it
either :(
> > /*
> > * vmdk_perform_cow
> > *
> > @@ -1115,29 +1251,168 @@ exit:
> > return ret;
> > }
> >
> > -static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data,
> > - uint32_t offset)
> > +/*
> > + * handle_alloc
> > + *
> > + * Allocates new clusters for an area that either is yet unallocated or
> needs a
>
> Similar to vmdk_l2load, s/Allocates/Allocate/
>
> > + * copy on write. If *cluster_offset is non_zero, clusters are only
> allocated if
> > + * the new allocation can match the specified host offset.
> > + *
> > + * Returns:
> > + * VMDK_OK: if new clusters were allocated, *bytes may be
> decreased if
> > + * the new allocation doesn't cover all of the
> requested area.
> > + * *cluster_offset is updated to contain the offset of
> the
> > + * first newly allocated cluster.
> > + *
> > + * VMDK_UNALLOC: if no clusters could be allocated. *cluster_offset
> is left
> > + * unchanged.
> > + *
> > + * VMDK_ERROR: in error cases
> > + */
> > +static int handle_alloc(BlockDriverState *bs, VmdkExtent *extent,
> > + uint64_t offset, uint64_t *cluster_offset,
> > + int64_t *bytes, VmdkMetaData *m_data,
> > + bool allocate, uint32_t *total_alloc_clusters)
> > {
> > - offset = cpu_to_le32(offset);
> > - /* update L2 table */
> > - if (bdrv_pwrite_sync(extent->file,
> > - ((int64_t)m_data->l2_offset * 512)
> > - + (m_data->l2_index * sizeof(offset)),
> > - &offset, sizeof(offset)) < 0) {
> > - return VMDK_ERROR;
> > + int l1_index, l2_offset, l2_index;
> > + uint32_t *l2_table;
> > + uint32_t cluster_sector;
> > + uint32_t nb_clusters;
> > + bool zeroed = false;
> > + uint64_t skip_start_bytes, skip_end_bytes;
> > + int ret;
> > +
> > + ret = get_cluster_table(extent, offset, &l1_index, &l2_offset,
> > + &l2_index, &l2_table);
> > + if (ret < 0) {
> > + return ret;
> > }
> > - /* update backup L2 table */
> > - if (extent->l1_backup_table_offset != 0) {
> > - m_data->l2_offset = extent->l1_backup_table[m_data->l1_index];
> > - if (bdrv_pwrite_sync(extent->file,
> > - ((int64_t)m_data->l2_offset * 512)
> > - + (m_data->l2_index * sizeof(offset)),
> > - &offset, sizeof(offset)) < 0) {
> > - return VMDK_ERROR;
> > +
> > + cluster_sector = le32_to_cpu(l2_table[l2_index]);
> > +
> > + skip_start_bytes = vmdk_find_offset_in_cluster(extent, offset);
> > + /* Calculate the number of clusters to look for. Here it will
> return one
> > + * cluster less than the actual value calculated as we may need to
> perfrom
> > + * COW for the last one. */
> > + nb_clusters = size_to_clusters(extent, skip_start_bytes + *bytes);
> > +
> > + nb_clusters = MIN(nb_clusters, extent->l2_size - l2_index);
> > + assert(nb_clusters <= INT_MAX);
> > +
> > + /* update bytes according to final nb_clusters value */
> > + if (nb_clusters != 0) {
> > + *bytes = ((nb_clusters * extent->cluster_sectors) << 9)
> > + - skip_start_bytes;
[continuation of "why the while loop?"] ... here. So the bytes may get reduced
if nb_clusters is more than 512 (the L2 table margin). Thus @remaining down
there won't necessarily be zero after the first pass. I hope I explained it
correctly!
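To make that concrete, here is a minimal, self-contained sketch of the loop
behaviour with hypothetical numbers (the 64 KiB grain and 512 L2 entries are
assumptions for the example, not values taken from the series):

    /* Standalone illustration of why vmdk_alloc_cluster_offset() can need
     * more than one pass: handle_alloc() may shrink the request at an L2
     * table boundary, so @remaining is not always zero after the first pass. */
    #include <inttypes.h>
    #include <stdbool.h>
    #include <stdio.h>

    #define CLUSTER_SIZE (64 * 1024ULL)   /* hypothetical grain size in bytes */
    #define L2_SIZE      512ULL           /* hypothetical entries per L2 table */

    int main(void)
    {
        uint64_t remaining = 600 * CLUSTER_SIZE;  /* request crosses an L2 table */
        uint64_t l2_index = 100;                  /* first cluster sits mid-table */
        uint64_t n_bytes = 0;
        int pass = 0;

        while (true) {
            remaining -= n_bytes;
            if (remaining == 0) {
                break;
            }
            n_bytes = remaining;

            /* roughly what handle_alloc() does to *bytes */
            uint64_t nb_clusters = (n_bytes + CLUSTER_SIZE - 1) / CLUSTER_SIZE;
            if (nb_clusters > 0) {
                nb_clusters--;                        /* last cluster left for COW */
            }
            if (nb_clusters > L2_SIZE - l2_index) {
                nb_clusters = L2_SIZE - l2_index;     /* clamp at the L2 margin */
            }
            if (nb_clusters != 0) {
                n_bytes = nb_clusters * CLUSTER_SIZE; /* request may shrink here */
            }
            l2_index = 0;                             /* next pass: fresh L2 table */
            printf("pass %d: %" PRIu64 " bytes\n", ++pass, n_bytes);
        }
        return 0;
    }

With these numbers it prints three passes (412, 187 and 1 clusters' worth of
bytes), i.e. the outer loop really does run more than once.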
>
> > + } else {
> > + nb_clusters = 1;
> > + }
> > + *total_alloc_clusters += nb_clusters;
> > + skip_end_bytes = skip_start_bytes + MIN(*bytes,
> > + extent->cluster_sectors * BDRV_SECTOR_SIZE
> > + - skip_start_bytes);
> > +
> > + if (extent->has_zero_grain && cluster_sector == VMDK_GTE_ZEROED) {
> > + zeroed = true;
> > + }
> > +
> > + if (!cluster_sector || zeroed) {
> > + if (!allocate) {
> > + return zeroed ? VMDK_ZEROED : VMDK_UNALLOC;
> > + }
> > +
> > + cluster_sector = extent->next_cluster_sector;
> > + extent->next_cluster_sector += extent->cluster_sectors
> > + * nb_clusters;
> > +
> > + ret = vmdk_perform_cow(bs, extent, cluster_sector *
> BDRV_SECTOR_SIZE,
> > + offset, skip_start_bytes,
> > + skip_end_bytes);
> > + if (ret < 0) {
> > + return ret;
> > + }
> > + if (m_data) {
> > + m_data->valid = 1;
> > + m_data->l1_index = l1_index;
> > + m_data->l2_index = l2_index;
> > + m_data->l2_offset = l2_offset;
> > + m_data->l2_cache_entry = &l2_table[l2_index];
> > + m_data->nb_clusters = nb_clusters;
> > }
> > }
> > - if (m_data->l2_cache_entry) {
> > - *m_data->l2_cache_entry = offset;
> > + *cluster_offset = cluster_sector << BDRV_SECTOR_BITS;
> > + return VMDK_OK;
> > +}
> > +
> > +/*
> > + * vmdk_alloc_cluster_offset
>
> Maybe just name it "vmdk_alloc_clusters", which sounds better to me?
> Because the
> clusters are what we allocate here, it's rather
> "vmdk_alloc_clusters_and_get_offset" but we probably don't want it that
> long.
>
> > + *
> > + * For a given offset on the virtual disk, find the cluster offset in
> vmdk
> > + * file. If the offset is not found, allocate a new cluster.
> > + *
> > + * If the cluster is newly allocated, m_data->nb_clusters is set to the
> number
> > + * of contiguous clusters that have been allocated. In this case, the
> other
> > + * fields of m_data are valid and contain information about the first
> allocated
> > + * cluster.
> > + *
> > + * Returns:
> > + *
> > + * VMDK_OK: on success and @cluster_offset was set
> > + *
> > + * VMDK_UNALLOC: if no clusters were allocated and
> @cluster_offset is
> > + * set to zero
> > + *
> > + * VMDK_ERROR: in error cases
>
> Thank you for adding the function documentations!
>
> > + */
> > +static int vmdk_alloc_cluster_offset(BlockDriverState *bs,
> > + VmdkExtent *extent,
> > + VmdkMetaData *m_data, uint64_t
> offset,
> > + bool allocate, uint64_t
> *cluster_offset,
> > + int64_t bytes,
> > + uint32_t *total_alloc_clusters)
> > +{
> > + uint64_t start, remaining;
> > + uint64_t new_cluster_offset;
> > + int64_t n_bytes;
> > + int ret;
> > +
> > + if (extent->flat) {
> > + *cluster_offset = extent->flat_start_offset;
> > + return VMDK_OK;
> > + }
> > +
> > + start = offset;
> > + remaining = bytes;
> > + new_cluster_offset = 0;
> > + *cluster_offset = 0;
> > + n_bytes = 0;
> > + if (m_data) {
> > + m_data->valid = 0;
> > + }
> > +
> > + /* due to L2 table margins all bytes may not get allocated at once
> */
> > + while (true) {
> > +
> > + if (!*cluster_offset) {
> > + *cluster_offset = new_cluster_offset;
> > + }
> > +
> > + start += n_bytes;
> > + remaining -= n_bytes;
>
> Here, in the first iteration, remaining == bytes and n_bytes == 0.
>
> > + new_cluster_offset += n_bytes;
> > +
> > + if (remaining == 0) {
> > + break;
> > + }
> > +
> > + n_bytes = remaining;
>
> Then n_bytes becomes bytes;
>
> In the second iteration, remaining is always 0 because of "remaining -=
> n_bytes". What's the point of the while loop?
I need the while loop in case I truncate the bytes according to the L2
table margins... [scroll up to handle_alloc() above]
Ashijeet
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH v3 3/6] vmdk: New functions to assist allocating multiple clusters
2017-04-19 15:13 ` Ashijeet Acharya
@ 2017-04-20 0:47 ` Fam Zheng
0 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2017-04-20 0:47 UTC (permalink / raw)
To: Ashijeet Acharya; +Cc: jsnow, kwolf, mreitz, qemu-block, qemu-devel, stefanha
On Wed, 04/19 15:13, Ashijeet Acharya wrote:
> > In the second iteration, remaining is always 0 because of "remaining -=
> > n_bytes". What's the point of the while loop?
>
>
> I need the while loop in case if I truncate the bytes according to the L2
> table margins....[scroll up to handle alloc() __^ ]
Yes, I see it now.
Fam
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Qemu-devel] [PATCH v3 4/6] vmdk: Rename get_cluster_offset() to vmdk_get_cluster_offset()
2017-04-01 14:44 [Qemu-devel] [PATCH v3 0/6] Optimize VMDK I/O by allocating multiple clusters Ashijeet Acharya
` (2 preceding siblings ...)
2017-04-01 14:44 ` [Qemu-devel] [PATCH v3 3/6] vmdk: New functions to assist allocating multiple clusters Ashijeet Acharya
@ 2017-04-01 14:44 ` Ashijeet Acharya
2017-04-19 12:57 ` Fam Zheng
2017-04-01 14:44 ` [Qemu-devel] [PATCH v3 5/6] vmdk: Set maximum bytes allocated in one cycle Ashijeet Acharya
2017-04-01 14:44 ` [Qemu-devel] [PATCH v3 6/6] vmdk: Update metadata for multiple clusters Ashijeet Acharya
5 siblings, 1 reply; 21+ messages in thread
From: Ashijeet Acharya @ 2017-04-01 14:44 UTC (permalink / raw)
To: famz
Cc: kwolf, jsnow, mreitz, stefanha, qemu-devel, qemu-block,
Ashijeet Acharya
Rename the existing get_cluster_offset() function to
vmdk_get_cluster_offset() and have it make use of the new
get_cluster_table() to load the cluster tables. Also, it is no longer
used to allocate new clusters and hence perform COW. Make the necessary
renames at all the occurrences of get_cluster_offset().
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
---
block/vmdk.c | 117 +++++++++++------------------------------------------------
1 file changed, 21 insertions(+), 96 deletions(-)
diff --git a/block/vmdk.c b/block/vmdk.c
index e5a289d..a8babd7 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1419,7 +1419,7 @@ static int vmdk_alloc_cluster_offset(BlockDriverState *bs,
}
/**
- * get_cluster_offset
+ * vmdk_get_cluster_offset
*
* Look up cluster offset in extent file by sector number, and store in
* @cluster_offset.
@@ -1427,84 +1427,34 @@ static int vmdk_alloc_cluster_offset(BlockDriverState *bs,
* For flat extents, the start offset as parsed from the description file is
* returned.
*
- * For sparse extents, look up in L1, L2 table. If allocate is true, return an
- * offset for a new cluster and update L2 cache. If there is a backing file,
- * COW is done before returning; otherwise, zeroes are written to the allocated
- * cluster. Both COW and zero writing skips the sector range
- * [@skip_start_sector, @skip_end_sector) passed in by caller, because caller
- * has new data to write there.
+ * For sparse extents, look up in L1, L2 table.
*
* Returns: VMDK_OK if cluster exists and mapped in the image.
- * VMDK_UNALLOC if cluster is not mapped and @allocate is false.
- * VMDK_ERROR if failed.
+ * VMDK_UNALLOC if cluster is not mapped.
+ * VMDK_ERROR if failed
*/
-static int get_cluster_offset(BlockDriverState *bs,
- VmdkExtent *extent,
- VmdkMetaData *m_data,
- uint64_t offset,
- bool allocate,
- uint64_t *cluster_offset,
- uint64_t skip_start_bytes,
- uint64_t skip_end_bytes)
+static int vmdk_get_cluster_offset(BlockDriverState *bs,
+ VmdkExtent *extent,
+ uint64_t offset,
+ uint64_t *cluster_offset)
{
- unsigned int l1_index, l2_offset, l2_index;
- int min_index, i, j;
- uint32_t min_count, *l2_table;
+ int l1_index, l2_offset, l2_index;
+ uint32_t *l2_table;
bool zeroed = false;
int64_t ret;
int64_t cluster_sector;
- if (m_data) {
- m_data->valid = 0;
- }
if (extent->flat) {
*cluster_offset = extent->flat_start_offset;
return VMDK_OK;
}
- offset -= (extent->end_sector - extent->sectors) * SECTOR_SIZE;
- l1_index = (offset >> 9) / extent->l1_entry_sectors;
- if (l1_index >= extent->l1_size) {
- return VMDK_ERROR;
- }
- l2_offset = extent->l1_table[l1_index];
- if (!l2_offset) {
- return VMDK_UNALLOC;
- }
- for (i = 0; i < L2_CACHE_SIZE; i++) {
- if (l2_offset == extent->l2_cache_offsets[i]) {
- /* increment the hit count */
- if (++extent->l2_cache_counts[i] == 0xffffffff) {
- for (j = 0; j < L2_CACHE_SIZE; j++) {
- extent->l2_cache_counts[j] >>= 1;
- }
- }
- l2_table = extent->l2_cache + (i * extent->l2_size);
- goto found;
- }
- }
- /* not found: load a new entry in the least used one */
- min_index = 0;
- min_count = 0xffffffff;
- for (i = 0; i < L2_CACHE_SIZE; i++) {
- if (extent->l2_cache_counts[i] < min_count) {
- min_count = extent->l2_cache_counts[i];
- min_index = i;
- }
- }
- l2_table = extent->l2_cache + (min_index * extent->l2_size);
- if (bdrv_pread(extent->file,
- (int64_t)l2_offset * 512,
- l2_table,
- extent->l2_size * sizeof(uint32_t)
- ) != extent->l2_size * sizeof(uint32_t)) {
- return VMDK_ERROR;
+ ret = get_cluster_table(extent, offset, &l1_index, &l2_offset,
+ &l2_index, &l2_table);
+ if (ret < 0) {
+ return ret;
}
- extent->l2_cache_offsets[min_index] = l2_offset;
- extent->l2_cache_counts[min_index] = 1;
- found:
- l2_index = ((offset >> 9) / extent->cluster_sectors) % extent->l2_size;
cluster_sector = le32_to_cpu(l2_table[l2_index]);
if (extent->has_zero_grain && cluster_sector == VMDK_GTE_ZEROED) {
@@ -1512,31 +1462,9 @@ static int get_cluster_offset(BlockDriverState *bs,
}
if (!cluster_sector || zeroed) {
- if (!allocate) {
- return zeroed ? VMDK_ZEROED : VMDK_UNALLOC;
- }
-
- cluster_sector = extent->next_cluster_sector;
- extent->next_cluster_sector += extent->cluster_sectors;
-
- /* First of all we write grain itself, to avoid race condition
- * that may to corrupt the image.
- * This problem may occur because of insufficient space on host disk
- * or inappropriate VM shutdown.
- */
- ret = vmdk_perform_cow(bs, extent, cluster_sector * BDRV_SECTOR_SIZE,
- offset, skip_start_bytes, skip_end_bytes);
- if (ret) {
- return ret;
- }
- if (m_data) {
- m_data->valid = 1;
- m_data->l1_index = l1_index;
- m_data->l2_index = l2_index;
- m_data->l2_offset = l2_offset;
- m_data->l2_cache_entry = &l2_table[l2_index];
- }
+ return zeroed ? VMDK_ZEROED : VMDK_UNALLOC;
}
+
*cluster_offset = cluster_sector << BDRV_SECTOR_BITS;
return VMDK_OK;
}
@@ -1579,9 +1507,7 @@ static int64_t coroutine_fn vmdk_co_get_block_status(BlockDriverState *bs,
return 0;
}
qemu_co_mutex_lock(&s->lock);
- ret = get_cluster_offset(bs, extent, NULL,
- sector_num * 512, false, &offset,
- 0, 0);
+ ret = vmdk_get_cluster_offset(bs, extent, sector_num * 512, &offset);
qemu_co_mutex_unlock(&s->lock);
index_in_cluster = vmdk_find_index_in_cluster(extent, sector_num);
@@ -1772,13 +1698,13 @@ vmdk_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
ret = -EIO;
goto fail;
}
- ret = get_cluster_offset(bs, extent, NULL,
- offset, false, &cluster_offset, 0, 0);
offset_in_cluster = vmdk_find_offset_in_cluster(extent, offset);
n_bytes = MIN(bytes, extent->cluster_sectors * BDRV_SECTOR_SIZE
- offset_in_cluster);
+ ret = vmdk_get_cluster_offset(bs, extent, offset, &cluster_offset);
+
if (ret != VMDK_OK) {
/* if not allocated, try to read from parent image, if exist */
if (bs->backing && ret != VMDK_ZEROED) {
@@ -2508,9 +2434,8 @@ static int vmdk_check(BlockDriverState *bs, BdrvCheckResult *result,
sector_num);
break;
}
- ret = get_cluster_offset(bs, extent, NULL,
- sector_num << BDRV_SECTOR_BITS,
- false, &cluster_offset, 0, 0);
+ ret = vmdk_get_cluster_offset(bs, extent,
+ sector_num << BDRV_SECTOR_BITS, &cluster_offset);
if (ret == VMDK_ERROR) {
fprintf(stderr,
"ERROR: could not get cluster_offset for sector %"
--
2.6.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
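For context, here is a condensed sketch of how a read-path caller is expected to
consume the lookup-only helper, based on the vmdk_co_preadv() hunk above
(simplified for illustration, not the exact posted code):

    uint64_t cluster_offset;
    int ret = vmdk_get_cluster_offset(bs, extent, offset, &cluster_offset);

    if (ret == VMDK_OK) {
        /* cluster is mapped: read from cluster_offset + offset_in_cluster */
    } else if (ret == VMDK_ZEROED) {
        /* zero grain: return zeroes without touching the backing file */
    } else if (ret == VMDK_UNALLOC) {
        /* not mapped: read from the backing file if present, else zeroes */
    } else {
        /* VMDK_ERROR: fail the request */
    }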
* Re: [Qemu-devel] [PATCH v3 4/6] vmdk: Rename get_cluster_offset() to vmdk_get_cluster_offset()
2017-04-01 14:44 ` [Qemu-devel] [PATCH v3 4/6] vmdk: Rename get_cluster_offset() to vmdk_get_cluster_offset() Ashijeet Acharya
@ 2017-04-19 12:57 ` Fam Zheng
2017-04-19 15:21 ` Ashijeet Acharya
0 siblings, 1 reply; 21+ messages in thread
From: Fam Zheng @ 2017-04-19 12:57 UTC (permalink / raw)
To: Ashijeet Acharya; +Cc: kwolf, jsnow, mreitz, stefanha, qemu-devel, qemu-block
On Sat, 04/01 20:14, Ashijeet Acharya wrote:
> Rename the existing get_cluster_offset() function to
> vmdk_get_cluster_offset() and have it make use of the new
> get_cluster_table() to load the cluster tables. Also, it is no longer
> used to allocate new clusters and hence perform COW. Make the necessary
> renames at all the occurrences of get_cluster_offset().
>
> Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
> ---
> block/vmdk.c | 117 +++++++++++------------------------------------------------
> 1 file changed, 21 insertions(+), 96 deletions(-)
This is definitely more than a function rename, like I said in reply to patch 3,
it could probably be split into smaller ones (rename, and others, for example),
and reordered to make reviewing easier.
Fam
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH v3 4/6] vmdk: Rename get_cluster_offset() to vmdk_get_cluster_offset()
2017-04-19 12:57 ` Fam Zheng
@ 2017-04-19 15:21 ` Ashijeet Acharya
2017-04-20 0:45 ` Fam Zheng
0 siblings, 1 reply; 21+ messages in thread
From: Ashijeet Acharya @ 2017-04-19 15:21 UTC (permalink / raw)
To: Fam Zheng; +Cc: jsnow, kwolf, mreitz, qemu-block, qemu-devel, stefanha
On Wed, Apr 19, 2017 at 18:27 Fam Zheng <famz@redhat.com> wrote:
> On Sat, 04/01 20:14, Ashijeet Acharya wrote:
> > Rename the existing get_cluster_offset() function to
> > vmdk_get_cluster_offset() and have it make use of the new
> > get_cluster_table() to load the cluster tables. Also, it is no longer
> > used to allocate new clusters and hence perform COW. Make the necessary
> > renames at all the occurrences of get_cluster_offset().
> >
> > Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
> > ---
> > block/vmdk.c | 117
> +++++++++++------------------------------------------------
> > 1 file changed, 21 insertions(+), 96 deletions(-)
>
> This is definitely more than a function rename, like I said in reply to
> patch 3,
> it could probably be split into smaller ones (rename, and others, for
> example),
> and reordered to make reviewing easier.
Maybe, because I have also refactored it to have vmdk_get_cluster_offset()
make use of the get_cluster_table() (and friends) to avoid duplication.
I will try to split it as:
1. Rename
2. Refactor it to make use of get_cluster_table(), moving that part out of
patch 3 for now.
Will that work?
I think this will also keep the compiler happy while reviewing.
Ashijeet
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH v3 4/6] vmdk: Rename get_cluster_offset() to vmdk_get_cluster_offset()
2017-04-19 15:21 ` Ashijeet Acharya
@ 2017-04-20 0:45 ` Fam Zheng
0 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2017-04-20 0:45 UTC (permalink / raw)
To: Ashijeet Acharya; +Cc: kwolf, qemu-block, stefanha, qemu-devel, mreitz, jsnow
On Wed, 04/19 15:21, Ashijeet Acharya wrote:
> On Wed, Apr 19, 2017 at 18:27 Fam Zheng <famz@redhat.com> wrote:
>
> > On Sat, 04/01 20:14, Ashijeet Acharya wrote:
> > > Rename the existing get_cluster_offset() function to
> > > vmdk_get_cluster_offset() and have it make use of the new
> > > get_cluster_table() to load the cluster tables. Also, it is no longer
> > > used to allocate new clusters and hence perform COW. Make the necessary
> > > renames at all the occurrences of get_cluster_offset().
> > >
> > > Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
> > > ---
> > > block/vmdk.c | 117
> > +++++++++++------------------------------------------------
> > > 1 file changed, 21 insertions(+), 96 deletions(-)
> >
> > This is definitely more than a function rename, like I said in reply to
> > patch 3,
> > it could probably be split into smaller ones (rename, and others, for
> > example),
> > and reordered to make reviewing easier.
>
>
> Maybe, because I have also refactored it to have vmdk_get_cluster_offset()
> make use of the get_cluster_table() (and friends) to avoid duplication.
>
> I will try to split it as
>
> 1. Rename
> 2. Refactor it to make use of get_cluster_table() by moving that out of
> patch 3 as of now.
>
> Will that work?
Sounds good. Thanks.
Fam
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Qemu-devel] [PATCH v3 5/6] vmdk: Set maximum bytes allocated in one cycle
2017-04-01 14:44 [Qemu-devel] [PATCH v3 0/6] Optimize VMDK I/O by allocating multiple clusters Ashijeet Acharya
` (3 preceding siblings ...)
2017-04-01 14:44 ` [Qemu-devel] [PATCH v3 4/6] vmdk: Rename get_cluster_offset() to vmdk_get_cluster_offset() Ashijeet Acharya
@ 2017-04-01 14:44 ` Ashijeet Acharya
2017-04-19 13:00 ` Fam Zheng
2017-04-01 14:44 ` [Qemu-devel] [PATCH v3 6/6] vmdk: Update metadata for multiple clusters Ashijeet Acharya
5 siblings, 1 reply; 21+ messages in thread
From: Ashijeet Acharya @ 2017-04-01 14:44 UTC (permalink / raw)
To: famz
Cc: kwolf, jsnow, mreitz, stefanha, qemu-devel, qemu-block,
Ashijeet Acharya
Set the maximum bytes allowed to get allocated at once to be not more
than the extent size boundary to handle writes at two separate extents
appropriately.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
---
block/vmdk.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/block/vmdk.c b/block/vmdk.c
index a8babd7..9456ddd 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1767,6 +1767,7 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
int64_t offset_in_cluster, n_bytes;
uint64_t cluster_offset;
uint64_t bytes_done = 0;
+ uint64_t extent_size;
VmdkMetaData m_data;
uint32_t total_alloc_clusters = 0;
@@ -1782,9 +1783,17 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
if (!extent) {
return -EIO;
}
+ extent_size = extent->end_sector * BDRV_SECTOR_SIZE;
+
offset_in_cluster = vmdk_find_offset_in_cluster(extent, offset);
- n_bytes = MIN(bytes, extent->cluster_sectors * BDRV_SECTOR_SIZE
- - offset_in_cluster);
+
+ /* truncate n_bytes to first cluster because we need to perform COW */
+ if (offset_in_cluster > 0) {
+ n_bytes = MIN(bytes, extent->cluster_sectors * BDRV_SECTOR_SIZE
+ - offset_in_cluster);
+ } else {
+ n_bytes = MIN(bytes, extent_size - offset);
+ }
ret = vmdk_alloc_cluster_offset(bs, extent, &m_data, offset,
!(extent->compressed || zeroed),
--
2.6.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH v3 5/6] vmdk: Set maximum bytes allocated in one cycle
2017-04-01 14:44 ` [Qemu-devel] [PATCH v3 5/6] vmdk: Set maximum bytes allocated in one cycle Ashijeet Acharya
@ 2017-04-19 13:00 ` Fam Zheng
2017-04-21 14:53 ` Ashijeet Acharya
0 siblings, 1 reply; 21+ messages in thread
From: Fam Zheng @ 2017-04-19 13:00 UTC (permalink / raw)
To: Ashijeet Acharya; +Cc: kwolf, jsnow, mreitz, stefanha, qemu-devel, qemu-block
On Sat, 04/01 20:14, Ashijeet Acharya wrote:
> Set the maximum bytes allowed to get allocated at once to be not more
> than the extent size boundary to handle writes at two separate extents
> appropriately.
>
> Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
> ---
> block/vmdk.c | 13 +++++++++++--
> 1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/block/vmdk.c b/block/vmdk.c
> index a8babd7..9456ddd 100644
> --- a/block/vmdk.c
> +++ b/block/vmdk.c
> @@ -1767,6 +1767,7 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
> int64_t offset_in_cluster, n_bytes;
> uint64_t cluster_offset;
> uint64_t bytes_done = 0;
> + uint64_t extent_size;
> VmdkMetaData m_data;
> uint32_t total_alloc_clusters = 0;
>
> @@ -1782,9 +1783,17 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
> if (!extent) {
> return -EIO;
> }
> + extent_size = extent->end_sector * BDRV_SECTOR_SIZE;
Maybe extent_end to be more accurate?
> +
> offset_in_cluster = vmdk_find_offset_in_cluster(extent, offset);
> - n_bytes = MIN(bytes, extent->cluster_sectors * BDRV_SECTOR_SIZE
> - - offset_in_cluster);
> +
> + /* truncate n_bytes to first cluster because we need to perform COW */
Makes sense, but shouldn't this be squashed into patch 3? Because it looks
like it is fixing an intermediate bug.
> + if (offset_in_cluster > 0) {
> + n_bytes = MIN(bytes, extent->cluster_sectors * BDRV_SECTOR_SIZE
> + - offset_in_cluster);
> + } else {
> + n_bytes = MIN(bytes, extent_size - offset);
> + }
>
> ret = vmdk_alloc_cluster_offset(bs, extent, &m_data, offset,
> !(extent->compressed || zeroed),
> --
> 2.6.2
>
Fam
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH v3 5/6] vmdk: Set maximum bytes allocated in one cycle
2017-04-19 13:00 ` Fam Zheng
@ 2017-04-21 14:53 ` Ashijeet Acharya
2017-04-22 4:27 ` Ashijeet Acharya
0 siblings, 1 reply; 21+ messages in thread
From: Ashijeet Acharya @ 2017-04-21 14:53 UTC (permalink / raw)
To: Fam Zheng
Cc: Kevin Wolf, John Snow, Max Reitz, Stefan Hajnoczi,
QEMU Developers, qemu block
On Wed, Apr 19, 2017 at 6:30 PM, Fam Zheng <famz@redhat.com> wrote:
> On Sat, 04/01 20:14, Ashijeet Acharya wrote:
>> Set the maximum bytes allowed to get allocated at once to be not more
>> than the extent size boundary to handle writes at two separate extents
>> appropriately.
>>
>> Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
>> ---
>> block/vmdk.c | 13 +++++++++++--
>> 1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/block/vmdk.c b/block/vmdk.c
>> index a8babd7..9456ddd 100644
>> --- a/block/vmdk.c
>> +++ b/block/vmdk.c
>> @@ -1767,6 +1767,7 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
>> int64_t offset_in_cluster, n_bytes;
>> uint64_t cluster_offset;
>> uint64_t bytes_done = 0;
>> + uint64_t extent_size;
>> VmdkMetaData m_data;
>> uint32_t total_alloc_clusters = 0;
>>
>> @@ -1782,9 +1783,17 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
>> if (!extent) {
>> return -EIO;
>> }
>> + extent_size = extent->end_sector * BDRV_SECTOR_SIZE;
>
> Maybe extent_end to be more accurate?
Done
>> +
>> offset_in_cluster = vmdk_find_offset_in_cluster(extent, offset);
>> - n_bytes = MIN(bytes, extent->cluster_sectors * BDRV_SECTOR_SIZE
>> - - offset_in_cluster);
>> +
>> + /* truncate n_bytes to first cluster because we need to perform COW */
>
> Makes sense, but shouldn't this be squashed into patch 3? Because it looks
> like it is fixing an intermediate bug.
Did you mean that I should merge this whole patch into patch 3? Maybe
moving it before patch 3 rather than squashing it makes more sense?
Ashijeet
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH v3 5/6] vmdk: Set maximum bytes allocated in one cycle
2017-04-21 14:53 ` Ashijeet Acharya
@ 2017-04-22 4:27 ` Ashijeet Acharya
0 siblings, 0 replies; 21+ messages in thread
From: Ashijeet Acharya @ 2017-04-22 4:27 UTC (permalink / raw)
To: Fam Zheng
Cc: Kevin Wolf, John Snow, Max Reitz, Stefan Hajnoczi,
QEMU Developers, qemu block
On Fri, Apr 21, 2017 at 8:23 PM, Ashijeet Acharya
<ashijeetacharya@gmail.com> wrote:
> On Wed, Apr 19, 2017 at 6:30 PM, Fam Zheng <famz@redhat.com> wrote:
>> On Sat, 04/01 20:14, Ashijeet Acharya wrote:
>>> Set the maximum bytes allowed to get allocated at once to be not more
>>> than the extent size boundary to handle writes at two separate extents
>>> appropriately.
>>>
>>> Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
>>> ---
>>> block/vmdk.c | 13 +++++++++++--
>>> 1 file changed, 11 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/block/vmdk.c b/block/vmdk.c
>>> index a8babd7..9456ddd 100644
>>> --- a/block/vmdk.c
>>> +++ b/block/vmdk.c
>>> @@ -1767,6 +1767,7 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
>>> int64_t offset_in_cluster, n_bytes;
>>> uint64_t cluster_offset;
>>> uint64_t bytes_done = 0;
>>> + uint64_t extent_size;
>>> VmdkMetaData m_data;
>>> uint32_t total_alloc_clusters = 0;
>>>
>>> @@ -1782,9 +1783,17 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
>>> if (!extent) {
>>> return -EIO;
>>> }
>>> + extent_size = extent->end_sector * BDRV_SECTOR_SIZE;
>>
>> Maybe extent_end to be more accurate?
>
> Done
>
>>> +
>>> offset_in_cluster = vmdk_find_offset_in_cluster(extent, offset);
>>> - n_bytes = MIN(bytes, extent->cluster_sectors * BDRV_SECTOR_SIZE
>>> - - offset_in_cluster);
>>> +
>>> + /* truncate n_bytes to first cluster because we need to perform COW */
>>
>> Makes sense, but shouldn't this be squashed into patch 3? Because it looks
>> like it is fixing an intermediate bug.
>
> Did you mean that I should merge this whole patch into patch 3? Maybe
> moving it before patch 3 rather than squashing it makes more sense?
Instead, I have moved it before patch 3 in v4.
Ashijeet
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Qemu-devel] [PATCH v3 6/6] vmdk: Update metadata for multiple clusters
2017-04-01 14:44 [Qemu-devel] [PATCH v3 0/6] Optimize VMDK I/O by allocating multiple clusters Ashijeet Acharya
` (4 preceding siblings ...)
2017-04-01 14:44 ` [Qemu-devel] [PATCH v3 5/6] vmdk: Set maximum bytes allocated in one cycle Ashijeet Acharya
@ 2017-04-01 14:44 ` Ashijeet Acharya
2017-04-21 8:15 ` Fam Zheng
5 siblings, 1 reply; 21+ messages in thread
From: Ashijeet Acharya @ 2017-04-01 14:44 UTC (permalink / raw)
To: famz
Cc: kwolf, jsnow, mreitz, stefanha, qemu-devel, qemu-block,
Ashijeet Acharya
Include a next pointer in the VmdkMetaData struct to point to the previously
allocated L2 table. Modify vmdk_L2update to start updating metadata for
allocation of multiple clusters at once.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
---
block/vmdk.c | 136 ++++++++++++++++++++++++++++++++++++++++++++++++-----------
1 file changed, 111 insertions(+), 25 deletions(-)
diff --git a/block/vmdk.c b/block/vmdk.c
index 9456ddd..c7675db 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -137,6 +137,8 @@ typedef struct VmdkMetaData {
int valid;
uint32_t *l2_cache_entry;
uint32_t nb_clusters;
+ uint32_t offset;
+ struct VmdkMetaData *next;
} VmdkMetaData;
typedef struct VmdkGrainMarker {
@@ -263,6 +265,12 @@ static inline uint64_t size_to_clusters(VmdkExtent *extent, uint64_t size)
return (DIV_ROUND_UP(size + round_off_size, BDRV_SECTOR_SIZE * 128) - 1);
}
+static inline int64_t vmdk_align_offset(int64_t offset, int n)
+{
+ offset = (offset + n - 1) & ~(n - 1);
+ return offset;
+}
+
static uint32_t vmdk_read_cid(BlockDriverState *bs, int parent)
{
char *desc;
@@ -1037,29 +1045,88 @@ static void vmdk_refresh_limits(BlockDriverState *bs, Error **errp)
}
}
-static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data,
- uint32_t offset)
+static int vmdk_alloc_cluster_link_l2(VmdkExtent *extent,
+ VmdkMetaData *m_data, bool zeroed)
{
- offset = cpu_to_le32(offset);
+ int i;
+ uint32_t offset, temp_offset;
+ int *l2_table_array;
+ int l2_array_size;
+
+ if (zeroed) {
+ temp_offset = VMDK_GTE_ZEROED;
+ } else {
+ temp_offset = m_data->offset;
+ }
+
+ temp_offset = cpu_to_le32(temp_offset);
+
+ l2_array_size = sizeof(uint32_t) * m_data->nb_clusters;
+ l2_table_array = qemu_try_blockalign(extent->file->bs,
+ vmdk_align_offset(l2_array_size, 512));
+ if (l2_table_array == NULL) {
+ return VMDK_ERROR;
+ }
+ memset(l2_table_array, 0, vmdk_align_offset(l2_array_size, 512));
+
/* update L2 table */
+ offset = temp_offset;
+ for (i = 0; i < m_data->nb_clusters; i++) {
+ l2_table_array[i] = offset;
+ if (!zeroed) {
+ offset += 128;
+ }
+ }
+
if (bdrv_pwrite_sync(extent->file,
- ((int64_t)m_data->l2_offset * 512)
- + (m_data->l2_index * sizeof(offset)),
- &offset, sizeof(offset)) < 0) {
+ ((int64_t)m_data->l2_offset * 512)
+ + ((m_data->l2_index) * sizeof(offset)),
+ l2_table_array, l2_array_size) < 0) {
return VMDK_ERROR;
}
+
/* update backup L2 table */
if (extent->l1_backup_table_offset != 0) {
m_data->l2_offset = extent->l1_backup_table[m_data->l1_index];
if (bdrv_pwrite_sync(extent->file,
((int64_t)m_data->l2_offset * 512)
- + (m_data->l2_index * sizeof(offset)),
- &offset, sizeof(offset)) < 0) {
+ + ((m_data->l2_index) * sizeof(offset)),
+ l2_table_array, l2_array_size) < 0) {
return VMDK_ERROR;
}
}
+
+ offset = temp_offset;
if (m_data->l2_cache_entry) {
- *m_data->l2_cache_entry = offset;
+ for (i = 0; i < m_data->nb_clusters; i++) {
+ *m_data->l2_cache_entry = offset;
+ m_data->l2_cache_entry++;
+
+ if (!zeroed) {
+ offset += 128;
+ }
+ }
+ }
+
+ qemu_vfree(l2_table_array);
+ return VMDK_OK;
+}
+
+static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data,
+ bool zeroed)
+{
+ int ret;
+
+ while (m_data->next != NULL) {
+ VmdkMetaData *next;
+
+ ret = vmdk_alloc_cluster_link_l2(extent, m_data, zeroed);
+ if (ret < 0) {
+ return ret;
+ }
+
+ next = m_data->next;
+ m_data = next;
}
return VMDK_OK;
@@ -1271,7 +1338,7 @@ exit:
*/
static int handle_alloc(BlockDriverState *bs, VmdkExtent *extent,
uint64_t offset, uint64_t *cluster_offset,
- int64_t *bytes, VmdkMetaData *m_data,
+ int64_t *bytes, VmdkMetaData **m_data,
bool allocate, uint32_t *total_alloc_clusters)
{
int l1_index, l2_offset, l2_index;
@@ -1280,6 +1347,7 @@ static int handle_alloc(BlockDriverState *bs, VmdkExtent *extent,
uint32_t nb_clusters;
bool zeroed = false;
uint64_t skip_start_bytes, skip_end_bytes;
+ VmdkMetaData *old_m_data;
int ret;
ret = get_cluster_table(extent, offset, &l1_index, &l2_offset,
@@ -1330,13 +1398,21 @@ static int handle_alloc(BlockDriverState *bs, VmdkExtent *extent,
if (ret < 0) {
return ret;
}
- if (m_data) {
- m_data->valid = 1;
- m_data->l1_index = l1_index;
- m_data->l2_index = l2_index;
- m_data->l2_offset = l2_offset;
- m_data->l2_cache_entry = &l2_table[l2_index];
- m_data->nb_clusters = nb_clusters;
+
+ if (*m_data) {
+ old_m_data = *m_data;
+ *m_data = g_malloc0(sizeof(**m_data));
+
+ **m_data = (VmdkMetaData) {
+ .valid = 1,
+ .l1_index = l1_index,
+ .l2_index = l2_index,
+ .l2_offset = l2_offset,
+ .l2_cache_entry = &l2_table[l2_index],
+ .nb_clusters = nb_clusters,
+ .offset = cluster_sector,
+ .next = old_m_data,
+ };
}
}
*cluster_offset = cluster_sector << BDRV_SECTOR_BITS;
@@ -1365,7 +1441,7 @@ static int handle_alloc(BlockDriverState *bs, VmdkExtent *extent,
*/
static int vmdk_alloc_cluster_offset(BlockDriverState *bs,
VmdkExtent *extent,
- VmdkMetaData *m_data, uint64_t offset,
+ VmdkMetaData **m_data, uint64_t offset,
bool allocate, uint64_t *cluster_offset,
int64_t bytes,
uint32_t *total_alloc_clusters)
@@ -1385,8 +1461,8 @@ static int vmdk_alloc_cluster_offset(BlockDriverState *bs,
new_cluster_offset = 0;
*cluster_offset = 0;
n_bytes = 0;
- if (m_data) {
- m_data->valid = 0;
+ if (*m_data) {
+ (*m_data)->valid = 0;
}
/* due to L2 table margins all bytes may not get allocated at once */
@@ -1768,9 +1844,11 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
uint64_t cluster_offset;
uint64_t bytes_done = 0;
uint64_t extent_size;
- VmdkMetaData m_data;
+ VmdkMetaData *m_data;
uint32_t total_alloc_clusters = 0;
+ m_data = g_malloc0(sizeof(*m_data));
+
if (DIV_ROUND_UP(offset, BDRV_SECTOR_SIZE) > bs->total_sectors) {
error_report("Wrong offset: offset=0x%" PRIx64
" total_sectors=0x%" PRIx64,
@@ -1779,6 +1857,7 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
}
while (bytes > 0) {
+ m_data->next = NULL;
extent = find_extent(s, offset >> BDRV_SECTOR_BITS, extent);
if (!extent) {
return -EIO;
@@ -1825,7 +1904,7 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
total_alloc_clusters;
if (!zero_dry_run) {
/* update L2 tables */
- if (vmdk_L2update(extent, &m_data, VMDK_GTE_ZEROED)
+ if (vmdk_L2update(extent, m_data, zeroed)
!= VMDK_OK) {
return -EIO;
}
@@ -1839,10 +1918,9 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
if (ret) {
return ret;
}
- if (m_data.valid) {
+ if (m_data->valid) {
/* update L2 tables */
- if (vmdk_L2update(extent, &m_data,
- cluster_offset >> BDRV_SECTOR_BITS)
+ if (vmdk_L2update(extent, m_data, zeroed)
!= VMDK_OK) {
return -EIO;
}
@@ -1852,6 +1930,13 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
offset += n_bytes;
bytes_done += n_bytes;
+ while (m_data->next != NULL) {
+ VmdkMetaData *next;
+ next = m_data->next;
+ g_free(m_data);
+ m_data = next;
+ }
+
/* update CID on the first write every time the virtual disk is
* opened */
if (!s->cid_updated) {
@@ -1862,6 +1947,7 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
s->cid_updated = true;
}
}
+ g_free(m_data);
return 0;
}
--
2.6.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH v3 6/6] vmdk: Update metadata for multiple clusters
2017-04-01 14:44 ` [Qemu-devel] [PATCH v3 6/6] vmdk: Update metadata for multiple clusters Ashijeet Acharya
@ 2017-04-21 8:15 ` Fam Zheng
2017-04-22 4:13 ` Ashijeet Acharya
0 siblings, 1 reply; 21+ messages in thread
From: Fam Zheng @ 2017-04-21 8:15 UTC (permalink / raw)
To: Ashijeet Acharya; +Cc: kwolf, jsnow, mreitz, stefanha, qemu-devel, qemu-block
On Sat, 04/01 20:14, Ashijeet Acharya wrote:
> Include a next pointer in the VmdkMetaData struct to point to the previously
> allocated L2 table. Modify vmdk_L2update to start updating metadata for
> allocation of multiple clusters at once.
>
> Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
This is the metadata part of the coalesced allocation. I think patch 3 is
functionally incomplete without these changes, and is perhaps broken because
metadata is not handled correctly.
Such an "intermediate functional regression" is not good in a series, which we
need to avoid.
> ---
> block/vmdk.c | 136 ++++++++++++++++++++++++++++++++++++++++++++++++-----------
> 1 file changed, 111 insertions(+), 25 deletions(-)
>
> diff --git a/block/vmdk.c b/block/vmdk.c
> index 9456ddd..c7675db 100644
> --- a/block/vmdk.c
> +++ b/block/vmdk.c
> @@ -137,6 +137,8 @@ typedef struct VmdkMetaData {
> int valid;
> uint32_t *l2_cache_entry;
> uint32_t nb_clusters;
> + uint32_t offset;
> + struct VmdkMetaData *next;
> } VmdkMetaData;
>
> typedef struct VmdkGrainMarker {
> @@ -263,6 +265,12 @@ static inline uint64_t size_to_clusters(VmdkExtent *extent, uint64_t size)
> return (DIV_ROUND_UP(size + round_off_size, BDRV_SECTOR_SIZE * 128) - 1);
> }
>
> +static inline int64_t vmdk_align_offset(int64_t offset, int n)
> +{
> + offset = (offset + n - 1) & ~(n - 1);
> + return offset;
> +}
> +
> static uint32_t vmdk_read_cid(BlockDriverState *bs, int parent)
> {
> char *desc;
> @@ -1037,29 +1045,88 @@ static void vmdk_refresh_limits(BlockDriverState *bs, Error **errp)
> }
> }
>
> -static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data,
> - uint32_t offset)
> +static int vmdk_alloc_cluster_link_l2(VmdkExtent *extent,
> + VmdkMetaData *m_data, bool zeroed)
> {
> - offset = cpu_to_le32(offset);
> + int i;
> + uint32_t offset, temp_offset;
> + int *l2_table_array;
> + int l2_array_size;
> +
> + if (zeroed) {
> + temp_offset = VMDK_GTE_ZEROED;
> + } else {
> + temp_offset = m_data->offset;
> + }
> +
> + temp_offset = cpu_to_le32(temp_offset);
> +
> + l2_array_size = sizeof(uint32_t) * m_data->nb_clusters;
> + l2_table_array = qemu_try_blockalign(extent->file->bs,
> + vmdk_align_offset(l2_array_size, 512));
Indentation is off.
Use QEMU_ALIGN_UP, instead of vmdk_align_offset.
512 is a magic number, use BDRV_SECTOR_SIZE.
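Roughly, the suggested form might look like this (an untested sketch using the
existing QEMU helpers):

    l2_array_size = sizeof(uint32_t) * m_data->nb_clusters;
    l2_table_array = qemu_try_blockalign(extent->file->bs,
                                         QEMU_ALIGN_UP(l2_array_size,
                                                       BDRV_SECTOR_SIZE));
    if (l2_table_array == NULL) {
        return VMDK_ERROR;
    }
    memset(l2_table_array, 0, QEMU_ALIGN_UP(l2_array_size, BDRV_SECTOR_SIZE));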
> + if (l2_table_array == NULL) {
> + return VMDK_ERROR;
> + }
> + memset(l2_table_array, 0, vmdk_align_offset(l2_array_size, 512));
> +
> /* update L2 table */
> + offset = temp_offset;
> + for (i = 0; i < m_data->nb_clusters; i++) {
> + l2_table_array[i] = offset;
> + if (!zeroed) {
> + offset += 128;
Something is going wrong here with endianness on BE host, I believe.
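That is, "offset += 128" below is applied to a value that has already been
byte-swapped by cpu_to_le32(), so every entry after the first comes out wrong
on a big-endian host. A minimal sketch of one way to keep the running offset
in host byte order and convert each entry when storing it (illustrative only,
not the posted patch):

    uint32_t host_offset = zeroed ? VMDK_GTE_ZEROED : m_data->offset;

    for (i = 0; i < m_data->nb_clusters; i++) {
        l2_table_array[i] = cpu_to_le32(host_offset);
        if (!zeroed) {
            host_offset += 128;    /* sectors per cluster, as hard-coded here */
        }
    }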
> + }
> + }
> +
> if (bdrv_pwrite_sync(extent->file,
> - ((int64_t)m_data->l2_offset * 512)
> - + (m_data->l2_index * sizeof(offset)),
> - &offset, sizeof(offset)) < 0) {
> + ((int64_t)m_data->l2_offset * 512)
> + + ((m_data->l2_index) * sizeof(offset)),
> + l2_table_array, l2_array_size) < 0) {
You can fix the indentation while changing these lines. If not, don't change it,
or at least don't make it uglier.
> return VMDK_ERROR;
> }
> +
> /* update backup L2 table */
> if (extent->l1_backup_table_offset != 0) {
> m_data->l2_offset = extent->l1_backup_table[m_data->l1_index];
> if (bdrv_pwrite_sync(extent->file,
> ((int64_t)m_data->l2_offset * 512)
> - + (m_data->l2_index * sizeof(offset)),
> - &offset, sizeof(offset)) < 0) {
> + + ((m_data->l2_index) * sizeof(offset)),
> + l2_table_array, l2_array_size) < 0) {
Same here.
> return VMDK_ERROR;
> }
> }
> +
> + offset = temp_offset;
> if (m_data->l2_cache_entry) {
> - *m_data->l2_cache_entry = offset;
> + for (i = 0; i < m_data->nb_clusters; i++) {
> + *m_data->l2_cache_entry = offset;
> + m_data->l2_cache_entry++;
> +
> + if (!zeroed) {
> + offset += 128;
> + }
> + }
> + }
> +
> + qemu_vfree(l2_table_array);
> + return VMDK_OK;
> +}
> +
> +static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data,
> + bool zeroed)
> +{
> + int ret;
> +
> + while (m_data->next != NULL) {
> + VmdkMetaData *next;
> +
> + ret = vmdk_alloc_cluster_link_l2(extent, m_data, zeroed);
> + if (ret < 0) {
> + return ret;
> + }
> +
> + next = m_data->next;
> + m_data = next;
Why not simply "m_data = m_data->next" and drop the "next" variable?
> }
>
> return VMDK_OK;
> @@ -1271,7 +1338,7 @@ exit:
> */
> static int handle_alloc(BlockDriverState *bs, VmdkExtent *extent,
> uint64_t offset, uint64_t *cluster_offset,
> - int64_t *bytes, VmdkMetaData *m_data,
> + int64_t *bytes, VmdkMetaData **m_data,
> bool allocate, uint32_t *total_alloc_clusters)
> {
> int l1_index, l2_offset, l2_index;
> @@ -1280,6 +1347,7 @@ static int handle_alloc(BlockDriverState *bs, VmdkExtent *extent,
> uint32_t nb_clusters;
> bool zeroed = false;
> uint64_t skip_start_bytes, skip_end_bytes;
> + VmdkMetaData *old_m_data;
> int ret;
>
> ret = get_cluster_table(extent, offset, &l1_index, &l2_offset,
> @@ -1330,13 +1398,21 @@ static int handle_alloc(BlockDriverState *bs, VmdkExtent *extent,
> if (ret < 0) {
> return ret;
> }
> - if (m_data) {
> - m_data->valid = 1;
> - m_data->l1_index = l1_index;
> - m_data->l2_index = l2_index;
> - m_data->l2_offset = l2_offset;
> - m_data->l2_cache_entry = &l2_table[l2_index];
> - m_data->nb_clusters = nb_clusters;
> +
> + if (*m_data) {
> + old_m_data = *m_data;
> + *m_data = g_malloc0(sizeof(**m_data));
> +
> + **m_data = (VmdkMetaData) {
> + .valid = 1,
> + .l1_index = l1_index,
> + .l2_index = l2_index,
> + .l2_offset = l2_offset,
> + .l2_cache_entry = &l2_table[l2_index],
> + .nb_clusters = nb_clusters,
> + .offset = cluster_sector,
> + .next = old_m_data,
> + };
I think if the new m_data can be merged into the old, there is no need to
allocate a new one.
> }
> }
> *cluster_offset = cluster_sector << BDRV_SECTOR_BITS;
> @@ -1365,7 +1441,7 @@ static int handle_alloc(BlockDriverState *bs, VmdkExtent *extent,
> */
> static int vmdk_alloc_cluster_offset(BlockDriverState *bs,
> VmdkExtent *extent,
> - VmdkMetaData *m_data, uint64_t offset,
> + VmdkMetaData **m_data, uint64_t offset,
> bool allocate, uint64_t *cluster_offset,
> int64_t bytes,
> uint32_t *total_alloc_clusters)
> @@ -1385,8 +1461,8 @@ static int vmdk_alloc_cluster_offset(BlockDriverState *bs,
> new_cluster_offset = 0;
> *cluster_offset = 0;
> n_bytes = 0;
> - if (m_data) {
> - m_data->valid = 0;
> + if (*m_data) {
> + (*m_data)->valid = 0;
> }
>
> /* due to L2 table margins all bytes may not get allocated at once */
> @@ -1768,9 +1844,11 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
> uint64_t cluster_offset;
> uint64_t bytes_done = 0;
> uint64_t extent_size;
> - VmdkMetaData m_data;
> + VmdkMetaData *m_data;
> uint32_t total_alloc_clusters = 0;
>
> + m_data = g_malloc0(sizeof(*m_data));
> +
> if (DIV_ROUND_UP(offset, BDRV_SECTOR_SIZE) > bs->total_sectors) {
> error_report("Wrong offset: offset=0x%" PRIx64
> " total_sectors=0x%" PRIx64,
> @@ -1779,6 +1857,7 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
> }
>
> while (bytes > 0) {
> + m_data->next = NULL;
> extent = find_extent(s, offset >> BDRV_SECTOR_BITS, extent);
> if (!extent) {
> return -EIO;
> @@ -1825,7 +1904,7 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
> total_alloc_clusters;
> if (!zero_dry_run) {
> /* update L2 tables */
> - if (vmdk_L2update(extent, &m_data, VMDK_GTE_ZEROED)
> + if (vmdk_L2update(extent, m_data, zeroed)
> != VMDK_OK) {
> return -EIO;
> }
> @@ -1839,10 +1918,9 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
> if (ret) {
> return ret;
> }
> - if (m_data.valid) {
> + if (m_data->valid) {
> /* update L2 tables */
> - if (vmdk_L2update(extent, &m_data,
> - cluster_offset >> BDRV_SECTOR_BITS)
> + if (vmdk_L2update(extent, m_data, zeroed)
> != VMDK_OK) {
> return -EIO;
> }
> @@ -1852,6 +1930,13 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
> offset += n_bytes;
> bytes_done += n_bytes;
>
> + while (m_data->next != NULL) {
> + VmdkMetaData *next;
> + next = m_data->next;
> + g_free(m_data);
> + m_data = next;
> + }
> +
> /* update CID on the first write every time the virtual disk is
> * opened */
> if (!s->cid_updated) {
> @@ -1862,6 +1947,7 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
> s->cid_updated = true;
> }
> }
> + g_free(m_data);
This is weird, you free all but the last m_data with a while loop, a few lines
above, and this one with a separate g_free().
Please use one loop:
for (p = m_data; p; p = next) {
next = p->next;
g_free(p);
}
> return 0;
> }
>
> --
> 2.6.2
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH v3 6/6] vmdk: Update metadata for multiple clusters
2017-04-21 8:15 ` Fam Zheng
@ 2017-04-22 4:13 ` Ashijeet Acharya
0 siblings, 0 replies; 21+ messages in thread
From: Ashijeet Acharya @ 2017-04-22 4:13 UTC (permalink / raw)
To: Fam Zheng
Cc: Kevin Wolf, John Snow, Max Reitz, Stefan Hajnoczi,
QEMU Developers, qemu block
On Fri, Apr 21, 2017 at 1:45 PM, Fam Zheng <famz@redhat.com> wrote:
> On Sat, 04/01 20:14, Ashijeet Acharya wrote:
>> Include a next pointer in the VmdkMetaData struct to point to the previously
>> allocated L2 table. Modify vmdk_L2update to start updating metadata for
>> allocation of multiple clusters at once.
>>
>> Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
>
> This is the metadata part of the coalesced allocation. I think patch 3 is
> functionally incomplete without these changes, and is perhaps broken because
> metadata is not handled correctly.
>
> Such an "intermediate functional regression" is not good in a series, which we
> need to avoid.
I have moved this patch right after patch 3 because merging both will
result in an unnecessarily huge patch. Will that work?
>
>> ---
>> block/vmdk.c | 136 ++++++++++++++++++++++++++++++++++++++++++++++++-----------
>> 1 file changed, 111 insertions(+), 25 deletions(-)
>>
>> diff --git a/block/vmdk.c b/block/vmdk.c
>> index 9456ddd..c7675db 100644
>> --- a/block/vmdk.c
>> +++ b/block/vmdk.c
>> @@ -137,6 +137,8 @@ typedef struct VmdkMetaData {
>> int valid;
>> uint32_t *l2_cache_entry;
>> uint32_t nb_clusters;
>> + uint32_t offset;
>> + struct VmdkMetaData *next;
>> } VmdkMetaData;
>>
>> typedef struct VmdkGrainMarker {
>> @@ -263,6 +265,12 @@ static inline uint64_t size_to_clusters(VmdkExtent *extent, uint64_t size)
>> return (DIV_ROUND_UP(size + round_off_size, BDRV_SECTOR_SIZE * 128) - 1);
>> }
>>
>> +static inline int64_t vmdk_align_offset(int64_t offset, int n)
>> +{
>> + offset = (offset + n - 1) & ~(n - 1);
>> + return offset;
>> +}
>> +
>> static uint32_t vmdk_read_cid(BlockDriverState *bs, int parent)
>> {
>> char *desc;
>> @@ -1037,29 +1045,88 @@ static void vmdk_refresh_limits(BlockDriverState *bs, Error **errp)
>> }
>> }
>>
>> -static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data,
>> - uint32_t offset)
>> +static int vmdk_alloc_cluster_link_l2(VmdkExtent *extent,
>> + VmdkMetaData *m_data, bool zeroed)
>> {
>> - offset = cpu_to_le32(offset);
>> + int i;
>> + uint32_t offset, temp_offset;
>> + int *l2_table_array;
>> + int l2_array_size;
>> +
>> + if (zeroed) {
>> + temp_offset = VMDK_GTE_ZEROED;
>> + } else {
>> + temp_offset = m_data->offset;
>> + }
>> +
>> + temp_offset = cpu_to_le32(temp_offset);
>> +
>> + l2_array_size = sizeof(uint32_t) * m_data->nb_clusters;
>> + l2_table_array = qemu_try_blockalign(extent->file->bs,
>> + vmdk_align_offset(l2_array_size, 512));
>
> Indentation is off.
>
> Use QEMU_ALIGN_UP, instead of vmdk_align_offset.
>
> 512 is a magic number, use BDRV_SECTOR_SIZE.
Done
>
>> + if (l2_table_array == NULL) {
>> + return VMDK_ERROR;
>> + }
>> + memset(l2_table_array, 0, vmdk_align_offset(l2_array_size, 512));
>> +
>> /* update L2 table */
>> + offset = temp_offset;
>> + for (i = 0; i < m_data->nb_clusters; i++) {
>> + l2_table_array[i] = offset;
>> + if (!zeroed) {
>> + offset += 128;
>
> Something is going wrong here with endianness on BE host, I believe.
I have changed temp_offset to LE above; wouldn't that be enough? I am not sure.
>
>> + }
>> + }
>> +
>> if (bdrv_pwrite_sync(extent->file,
>> - ((int64_t)m_data->l2_offset * 512)
>> - + (m_data->l2_index * sizeof(offset)),
>> - &offset, sizeof(offset)) < 0) {
>> + ((int64_t)m_data->l2_offset * 512)
>> + + ((m_data->l2_index) * sizeof(offset)),
>> + l2_table_array, l2_array_size) < 0) {
>
> You can fix the indentation while changing these lines. If not, don't change it,
> or at least don't make it uglier.
I have aligned it; if it still looks ugly in v4, I will revert.
>
>> return VMDK_ERROR;
>> }
>> +
>> /* update backup L2 table */
>> if (extent->l1_backup_table_offset != 0) {
>> m_data->l2_offset = extent->l1_backup_table[m_data->l1_index];
>> if (bdrv_pwrite_sync(extent->file,
>> ((int64_t)m_data->l2_offset * 512)
>> - + (m_data->l2_index * sizeof(offset)),
>> - &offset, sizeof(offset)) < 0) {
>> + + ((m_data->l2_index) * sizeof(offset)),
>> + l2_table_array, l2_array_size) < 0) {
>
> Same here.
>
>> return VMDK_ERROR;
>> }
>> }
>> +
>> + offset = temp_offset;
>> if (m_data->l2_cache_entry) {
>> - *m_data->l2_cache_entry = offset;
>> + for (i = 0; i < m_data->nb_clusters; i++) {
>> + *m_data->l2_cache_entry = offset;
>> + m_data->l2_cache_entry++;
>> +
>> + if (!zeroed) {
>> + offset += 128;
>> + }
>> + }
>> + }
>> +
>> + qemu_vfree(l2_table_array);
>> + return VMDK_OK;
>> +}
>> +
>> +static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data,
>> + bool zeroed)
>> +{
>> + int ret;
>> +
>> + while (m_data->next != NULL) {
>> + VmdkMetaData *next;
>> +
>> + ret = vmdk_alloc_cluster_link_l2(extent, m_data, zeroed);
>> + if (ret < 0) {
>> + return ret;
>> + }
>> +
>> + next = m_data->next;
>> + m_data = next;
>
> Why not simply "m_data = m_data->next" and drop "next" variable?
>> }
>>
>> return VMDK_OK;
>> @@ -1271,7 +1338,7 @@ exit:
>> */
>> static int handle_alloc(BlockDriverState *bs, VmdkExtent *extent,
>> uint64_t offset, uint64_t *cluster_offset,
>> - int64_t *bytes, VmdkMetaData *m_data,
>> + int64_t *bytes, VmdkMetaData **m_data,
>> bool allocate, uint32_t *total_alloc_clusters)
>> {
>> int l1_index, l2_offset, l2_index;
>> @@ -1280,6 +1347,7 @@ static int handle_alloc(BlockDriverState *bs, VmdkExtent *extent,
>> uint32_t nb_clusters;
>> bool zeroed = false;
>> uint64_t skip_start_bytes, skip_end_bytes;
>> + VmdkMetaData *old_m_data;
>> int ret;
>>
>> ret = get_cluster_table(extent, offset, &l1_index, &l2_offset,
>> @@ -1330,13 +1398,21 @@ static int handle_alloc(BlockDriverState *bs, VmdkExtent *extent,
>> if (ret < 0) {
>> return ret;
>> }
>> - if (m_data) {
>> - m_data->valid = 1;
>> - m_data->l1_index = l1_index;
>> - m_data->l2_index = l2_index;
>> - m_data->l2_offset = l2_offset;
>> - m_data->l2_cache_entry = &l2_table[l2_index];
>> - m_data->nb_clusters = nb_clusters;
>> +
>> + if (*m_data) {
>> + old_m_data = *m_data;
>> + *m_data = g_malloc0(sizeof(**m_data));
>> +
>> + **m_data = (VmdkMetaData) {
>> + .valid = 1,
>> + .l1_index = l1_index,
>> + .l2_index = l2_index,
>> + .l2_offset = l2_offset,
>> + .l2_cache_entry = &l2_table[l2_index],
>> + .nb_clusters = nb_clusters,
>> + .offset = cluster_sector,
>> + .next = old_m_data,
>> + };
>
> I think if the new m_data can be merged into the old, there is no need to
> allocate a new one.
Do you mean that if the clusters lie in the same L2 table, then merge
them? I think this case only appears when I leave out the first and
last cluster for COW. If I misunderstood, sorry!
I think I will post v4 without addressing this issue, and we can discuss
it when you are available after the weekend.
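One possible reading of the suggestion, sketched under the assumption that
"merge" means growing the existing entry when the newly allocated clusters
simply continue it within the same L2 table (hypothetical, not code from this
series):

    if (*m_data && (*m_data)->valid &&
        (*m_data)->l2_offset == l2_offset &&
        (*m_data)->l2_index + (*m_data)->nb_clusters == l2_index) {
        /* contiguous entries in the same L2 table: extend instead of chaining */
        (*m_data)->nb_clusters += nb_clusters;
    } else {
        /* otherwise allocate and chain a new VmdkMetaData node, as in the patch */
    }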
>
>> }
>> }
>> *cluster_offset = cluster_sector << BDRV_SECTOR_BITS;
>> @@ -1365,7 +1441,7 @@ static int handle_alloc(BlockDriverState *bs, VmdkExtent *extent,
>> */
>> static int vmdk_alloc_cluster_offset(BlockDriverState *bs,
>> VmdkExtent *extent,
>> - VmdkMetaData *m_data, uint64_t offset,
>> + VmdkMetaData **m_data, uint64_t offset,
>> bool allocate, uint64_t *cluster_offset,
>> int64_t bytes,
>> uint32_t *total_alloc_clusters)
>> @@ -1385,8 +1461,8 @@ static int vmdk_alloc_cluster_offset(BlockDriverState *bs,
>> new_cluster_offset = 0;
>> *cluster_offset = 0;
>> n_bytes = 0;
>> - if (m_data) {
>> - m_data->valid = 0;
>> + if (*m_data) {
>> + (*m_data)->valid = 0;
>> }
>>
>> /* due to L2 table margins all bytes may not get allocated at once */
>> @@ -1768,9 +1844,11 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
>> uint64_t cluster_offset;
>> uint64_t bytes_done = 0;
>> uint64_t extent_size;
>> - VmdkMetaData m_data;
>> + VmdkMetaData *m_data;
>> uint32_t total_alloc_clusters = 0;
>>
>> + m_data = g_malloc0(sizeof(*m_data));
>> +
[scroll till here] [1] So this allocation will need to move....[2]
>> if (DIV_ROUND_UP(offset, BDRV_SECTOR_SIZE) > bs->total_sectors) {
>> error_report("Wrong offset: offset=0x%" PRIx64
>> " total_sectors=0x%" PRIx64,
>> @@ -1779,6 +1857,7 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
>> }
>>
>> while (bytes > 0) {
....[2] here. Thus we will need to allocate it again every time we
enter here; otherwise the very next line, m_data->next = NULL, will
segfault.
So maybe it's good to free it separately?
I will retain it this way for v4, and change it in v5 if you still
say so after reading my reasoning.
>> + m_data->next = NULL;
>> extent = find_extent(s, offset >> BDRV_SECTOR_BITS, extent);
>> if (!extent) {
>> return -EIO;
>> @@ -1825,7 +1904,7 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
>> total_alloc_clusters;
>> if (!zero_dry_run) {
>> /* update L2 tables */
>> - if (vmdk_L2update(extent, &m_data, VMDK_GTE_ZEROED)
>> + if (vmdk_L2update(extent, m_data, zeroed)
>> != VMDK_OK) {
>> return -EIO;
>> }
>> @@ -1839,10 +1918,9 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
>> if (ret) {
>> return ret;
>> }
>> - if (m_data.valid) {
>> + if (m_data->valid) {
>> /* update L2 tables */
>> - if (vmdk_L2update(extent, &m_data,
>> - cluster_offset >> BDRV_SECTOR_BITS)
>> + if (vmdk_L2update(extent, m_data, zeroed)
>> != VMDK_OK) {
>> return -EIO;
>> }
>> @@ -1852,6 +1930,13 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
>> offset += n_bytes;
>> bytes_done += n_bytes;
>>
>> + while (m_data->next != NULL) {
>> + VmdkMetaData *next;
>> + next = m_data->next;
>> + g_free(m_data);
>> + m_data = next;
>> + }
>> +
>> /* update CID on the first write every time the virtual disk is
>> * opened */
>> if (!s->cid_updated) {
>> @@ -1862,6 +1947,7 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
>> s->cid_updated = true;
>> }
>> }
>> + g_free(m_data);
>
> This is weird, you free all but the last m_data with a while loop, a few lines
> above, and this one with a separate g_free().
>
> Please use one loop:
>
> for (p = m_data; p; p = next) {
> next = p->next;
> g_free(p);
> }
I have a good (maybe good enough) reason for it: if I free it in the
while loop above, then I will need to allocate it again when we re-enter
the outer while (bytes > 0) loop; otherwise we will segfault for
everything from that point onwards....[scroll up __^] [1]
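In outline, the lifetime being argued for here looks like this (condensed from
the patch, simplified):

    VmdkMetaData *m_data = g_malloc0(sizeof(*m_data));   /* head, before the loop */

    while (bytes > 0) {
        m_data->next = NULL;
        /* ... handle_alloc() may prepend freshly allocated nodes here ... */

        /* free only the prepended nodes; the head survives for the next pass */
        while (m_data->next != NULL) {
            VmdkMetaData *next = m_data->next;
            g_free(m_data);
            m_data = next;
        }
    }
    g_free(m_data);                                       /* the surviving head */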
Ashijeet
^ permalink raw reply [flat|nested] 21+ messages in thread