Re: [Qemu-devel] [PATCH v6 6/8] vmdk: New functions to assist allocating multiple clusters

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Fam Zheng <famz@redhat.com>
To: Ashijeet Acharya <ashijeetacharya@gmail.com>
Cc: kwolf@redhat.com, jsnow@redhat.com, mreitz@redhat.com,
	stefanha@gmail.com, qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v6 6/8] vmdk: New functions to assist allocating multiple clusters
Date: Tue, 27 Jun 2017 16:02:10 +0800	[thread overview]
Message-ID: <20170627080210.GB14166@lemon.lan> (raw)
In-Reply-To: <1496649172-26982-7-git-send-email-ashijeetacharya@gmail.com>

On Mon, 06/05 13:22, Ashijeet Acharya wrote:
> +/*
> + * vmdk_handle_alloc
> + *
> + * Allocate new clusters for an area that either is yet unallocated or needs a
> + * copy on write. If *cluster_offset is non_zero, clusters are only allocated if
> + * the new allocation can match the specified host offset.

I don't think this matches the function body, the passed in *cluster_offset
value is ignored.

> + *
> + * Returns:
> + *   VMDK_OK:       if new clusters were allocated, *bytes may be decreased if
> + *                  the new allocation doesn't cover all of the requested area.
> + *                  *cluster_offset is updated to contain the offset of the
> + *                  first newly allocated cluster.
> + *
> + *   VMDK_UNALLOC:  if no clusters could be allocated. *cluster_offset is left
> + *                  unchanged.
> + *
> + *   VMDK_ERROR:    in error cases
> + */
> +static int vmdk_handle_alloc(BlockDriverState *bs, VmdkExtent *extent,
> +                             uint64_t offset, uint64_t *cluster_offset,
> +                             int64_t *bytes, VmdkMetaData *m_data,
> +                             bool allocate, uint32_t *alloc_clusters_counter)
> +{
> +    int l1_index, l2_offset, l2_index;
> +    uint32_t *l2_table;
> +    uint32_t cluster_sector;
> +    uint32_t nb_clusters;
> +    bool zeroed = false;
> +    uint64_t skip_start_bytes, skip_end_bytes;
> +    int ret;
> +
> +    ret = get_cluster_table(extent, offset, &l1_index, &l2_offset,
> +                            &l2_index, &l2_table);
> +    if (ret < 0) {
> +        return ret;
> +    }
> +
> +    cluster_sector = le32_to_cpu(l2_table[l2_index]);
> +
> +    skip_start_bytes = vmdk_find_offset_in_cluster(extent, offset);
> +    /* Calculate the number of clusters to look for. Here we truncate the last
> +     * cluster, i.e. 1 less than the actual value calculated as we may need to
> +     * perform COW for the last one. */
> +    nb_clusters = DIV_ROUND_UP(skip_start_bytes + *bytes,
> +                               extent->cluster_sectors << BDRV_SECTOR_BITS) - 1;
> +
> +    nb_clusters = MIN(nb_clusters, extent->l2_size - l2_index);
> +    assert(nb_clusters <= INT_MAX);
> +
> +    /* update bytes according to final nb_clusters value */
> +    if (nb_clusters != 0) {
> +        *bytes = ((nb_clusters * extent->cluster_sectors) << BDRV_SECTOR_BITS)
> +                 - skip_start_bytes;
> +    } else {
> +        nb_clusters = 1;
> +    }
> +    *alloc_clusters_counter += nb_clusters;
> +    skip_end_bytes = skip_start_bytes + MIN(*bytes,
> +                     extent->cluster_sectors * BDRV_SECTOR_SIZE
> +                                    - skip_start_bytes);

I don't understand the MIN part, shouldn't skip_end_bytes simply be
skip_start_bytes + *bytes?

> +
> +    if (extent->has_zero_grain && cluster_sector == VMDK_GTE_ZEROED) {
> +        zeroed = true;
> +    }
> +
> +    if (!cluster_sector || zeroed) {
> +        if (!allocate) {
> +            return zeroed ? VMDK_ZEROED : VMDK_UNALLOC;
> +        }
> +
> +        cluster_sector = extent->next_cluster_sector;
> +        extent->next_cluster_sector += extent->cluster_sectors
> +                                                * nb_clusters;
> +
> +        ret = vmdk_perform_cow(bs, extent, cluster_sector * BDRV_SECTOR_SIZE,
> +                               offset, skip_start_bytes,
> +                               skip_end_bytes);
> +        if (ret < 0) {
> +            return ret;
> +        }
> +        if (m_data) {
> +            m_data->valid = 1;
> +            m_data->l1_index = l1_index;
> +            m_data->l2_index = l2_index;
> +            m_data->l2_offset = l2_offset;
> +            m_data->l2_cache_entry = &l2_table[l2_index];
> +            m_data->nb_clusters = nb_clusters;
> +        }
> +    }
> +    *cluster_offset = cluster_sector << BDRV_SECTOR_BITS;
> +    return VMDK_OK;
> +}
> +
> +/*
> + * vmdk_alloc_clusters
> + *
> + * For a given offset on the virtual disk, find the cluster offset in vmdk
> + * file. If the offset is not found, allocate a new cluster.
> + *
> + * If the cluster is newly allocated, m_data->nb_clusters is set to the number
> + * of contiguous clusters that have been allocated. In this case, the other
> + * fields of m_data are valid and contain information about the first allocated
> + * cluster.
> + *
> + * Returns:
> + *
> + *   VMDK_OK:           on success and @cluster_offset was set
> + *
> + *   VMDK_UNALLOC:      if no clusters were allocated and @cluster_offset is
> + *                      set to zero
> + *
> + *   VMDK_ERROR:        in error cases
> + */
> +static int vmdk_alloc_clusters(BlockDriverState *bs,
> +                               VmdkExtent *extent,
> +                               VmdkMetaData *m_data, uint64_t offset,
> +                               bool allocate, uint64_t *cluster_offset,
> +                               int64_t bytes,
> +                               uint32_t *total_alloc_clusters)
> +{
> +    uint64_t start, remaining;
> +    uint64_t new_cluster_offset;
> +    int64_t n_bytes;
> +    int ret;
> +
> +    if (extent->flat) {
> +        *cluster_offset = extent->flat_start_offset;
> +        return VMDK_OK;
> +    }
> +
> +    start = offset;
> +    remaining = bytes;
> +    new_cluster_offset = 0;
> +    *cluster_offset = 0;
> +    n_bytes = 0;
> +    if (m_data) {
> +        m_data->valid = 0;
> +    }
> +
> +    /* due to L2 table margins all bytes may not get allocated at once */
> +    while (true) {
> +
> +        if (!*cluster_offset) {
> +            *cluster_offset = new_cluster_offset;
> +        }
> +
> +        start              += n_bytes;
> +        remaining          -= n_bytes;
> +        new_cluster_offset += n_bytes;

Like said above, even though you increment new_cluster_offset by n_bytes, it has
no effect inside vmdk_handle_alloc. Is this intended?

> +
> +        if (remaining == 0) {
> +            break;
> +        }
> +
> +        n_bytes = remaining;
> +
> +        ret = vmdk_handle_alloc(bs, extent, start, &new_cluster_offset, &n_bytes,
> +                                m_data, allocate, total_alloc_clusters);
> +
> +        if (ret < 0) {
> +            return ret;
> +
> +        }
> +    }
> +
> +    return VMDK_OK;
> +}
> +
>  /**
>   * vmdk_get_cluster_offset
>   *
> @@ -1625,6 +1794,7 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
>      uint64_t bytes_done = 0;
>      VmdkMetaData m_data;
>      uint64_t extent_end;
> +    uint32_t total_alloc_clusters = 0;
>  
>      if (DIV_ROUND_UP(offset, BDRV_SECTOR_SIZE) > bs->total_sectors) {
>          error_report("Wrong offset: offset=0x%" PRIx64
> @@ -1650,10 +1820,10 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
>              n_bytes = MIN(bytes, extent_end - offset);
>          }
>  
> -        ret = vmdk_get_cluster_offset(bs, extent, &m_data, offset,
> -                                      !(extent->compressed || zeroed),
> -                                      &cluster_offset, offset_in_cluster,
> -                                      offset_in_cluster + n_bytes);
> +        ret = vmdk_alloc_clusters(bs, extent, &m_data, offset,
> +                                  !(extent->compressed || zeroed),
> +                                  &cluster_offset, n_bytes,
> +                                  &total_alloc_clusters);
>          if (extent->compressed) {
>              if (ret == VMDK_OK) {
>                  /* Refuse write to allocated cluster for streamOptimized */
> @@ -1662,8 +1832,9 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
>                  return -EIO;
>              } else {
>                  /* allocate */
> -                ret = vmdk_get_cluster_offset(bs, extent, &m_data, offset,
> -                                              true, &cluster_offset, 0, 0);
> +                ret = vmdk_alloc_clusters(bs, extent, &m_data, offset,
> +                                          true, &cluster_offset, n_bytes,
> +                                          &total_alloc_clusters);
>              }
>          }
>          if (ret == VMDK_ERROR) {
> @@ -1671,10 +1842,11 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
>          }
>          if (zeroed) {
>              /* Do zeroed write, buf is ignored */
> -            if (extent->has_zero_grain &&
> -                    offset_in_cluster == 0 &&
> -                    n_bytes >= extent->cluster_sectors * BDRV_SECTOR_SIZE) {
> -                n_bytes = extent->cluster_sectors * BDRV_SECTOR_SIZE;
> +            if (extent->has_zero_grain && offset_in_cluster == 0 &&
> +                    n_bytes >= extent->cluster_sectors * BDRV_SECTOR_SIZE *
> +                        total_alloc_clusters) {
> +                n_bytes = extent->cluster_sectors * BDRV_SECTOR_SIZE *
> +                                        total_alloc_clusters;
>                  if (!zero_dry_run) {
>                      /* update L2 tables */
>                      if (vmdk_L2update(extent, &m_data, VMDK_GTE_ZEROED)
> -- 
> 2.6.2
> 

Fam

next prev parent reply	other threads:[~2017-06-27  8:02 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-05  7:52 [Qemu-devel] [PATCH v6 0/8] Optimize VMDK I/O by allocating multiple clusters Ashijeet Acharya
2017-06-05  7:52 ` [Qemu-devel] [PATCH v6 1/8] vmdk: Move vmdk_find_offset_in_cluster() to the top Ashijeet Acharya
2017-06-05  7:52 ` [Qemu-devel] [PATCH v6 2/8] vmdk: Rename get_whole_cluster() to vmdk_perform_cow() Ashijeet Acharya
2017-06-05  7:52 ` [Qemu-devel] [PATCH v6 3/8] vmdk: Rename get_cluster_offset() to vmdk_get_cluster_offset() Ashijeet Acharya
2017-06-05  7:52 ` [Qemu-devel] [PATCH v6 4/8] vmdk: Factor out metadata loading code out of vmdk_get_cluster_offset() Ashijeet Acharya
2017-06-05  7:52 ` [Qemu-devel] [PATCH v6 5/8] vmdk: Set maximum bytes allocated in one cycle Ashijeet Acharya
2017-06-05  7:52 ` [Qemu-devel] [PATCH v6 6/8] vmdk: New functions to assist allocating multiple clusters Ashijeet Acharya
2017-06-27  8:02   ` Fam Zheng [this message]
2017-06-29  7:42     ` Ashijeet Acharya
2017-06-05  7:52 ` [Qemu-devel] [PATCH v6 7/8] vmdk: Update metadata for " Ashijeet Acharya
2017-06-27  8:04   ` Fam Zheng
2017-06-29  8:48     ` Ashijeet Acharya
2017-06-27  8:14   ` Fam Zheng
2017-06-05  7:52 ` [Qemu-devel] [PATCH v6 8/8] vmdk: Make vmdk_get_cluster_offset() return cluster offset only Ashijeet Acharya
2017-06-27  8:15   ` Fam Zheng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170627080210.GB14166@lemon.lan \
    --to=famz@redhat.com \
    --cc=ashijeetacharya@gmail.com \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).