From: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: kwol@redhat.com, wdongxu@cn.ibm.com, qemu-devel@nongnu.org,
stefanha@redhat.com
Subject: Re: [Qemu-devel] [PATCH V18 5/6] add-cow file format core code.
Date: Tue, 23 Apr 2013 09:54:25 +0800 [thread overview]
Message-ID: <5175E9D1.60401@linux.vnet.ibm.com> (raw)
In-Reply-To: <20130418100336.GC19587@stefanha-thinkpad.redhat.com>
On 2013/4/18 18:03, Stefan Hajnoczi wrote:
> On Wed, Apr 10, 2013 at 04:11:52PM +0800, Dong Xu Wang wrote:
>> + header.cluster_bits = ffs(cluster_size) - 1;
>> + if (header.cluster_bits < MIN_CLUSTER_BITS ||
>> + header.cluster_bits > MAX_CLUSTER_BITS ||
>> + (1 << header.cluster_bits) != cluster_size) {
>> + error_report(
>> + "Cluster size must be a power of two between %d and %dk",
>> + 1 << MIN_CLUSTER_BITS, 1 << (MAX_CLUSTER_BITS - 10));
>> + return -EINVAL;
>> + }
>> +
>> + header.header_size = MAX(cluster_size, DEFAULT_HEADER_SIZE);
>
> Indentation.
>
okay.
>> + if (backing_filename) {
>> + header.backing_offset = sizeof(header);
>> + header.backing_size = strlen(backing_filename);
>> +
>> + if (!backing_fmt) {
>> + backing_bs = bdrv_new("image");
>> + ret = bdrv_open(backing_bs, backing_filename, NULL,
>> + BDRV_O_RDWR | BDRV_O_CACHE_WB, NULL);
>> + if (ret < 0) {
>> + return ret;
>
> backing_bs is leaked.
Okay. will fix.
>
>> + ret = bdrv_file_open(&bs, filename, NULL, BDRV_O_RDWR);
>> + if (ret < 0) {
>> + return ret;
>> + }
>> + snprintf(header.backing_fmt, sizeof(header.backing_fmt), "%s",
>> + backing_fmt ? backing_fmt : "");
>> + snprintf(header.image_fmt, sizeof(header.image_fmt), "%s",
>> + image_format ? image_format : "raw");
>
> snprintf() doesn't have the semantics in the add-cow specification:
>
> " 44 - 59: backing file format
> Format of backing file. It will be filled with
> 0 if backing file name offset is 0. If backing
> file name offset is non-empty, it must be
> non-empty. It is coded in free-form ASCII, and
> is not NUL-terminated. Zero padded on the right.
>
> 60 - 75: image file format
> Format of image file. It must be non-empty. It
> is coded in free-form ASCII, and is not
> NUL-terminated. Zero padded on the right."
>
> strncpy() does the zero padding and doesn't NUL-terminate if the max buffer
> size is used.
>
Okay.
>> + if ((s->header.compat_features & ACOW_F_ALL_ALLOCATED) == 0) {
>> + snprintf(bs->backing_format, sizeof(bs->backing_format),
>> + "%s", s->header.backing_fmt);
>
> s->header.backing_fmt is not NUL-terminated so using snprintf() is
> inappropriate (could it read beyond the end of .backing_fmt?).
>
Okay.
>> + }
>> +
>> + if (s->header.cluster_bits < MIN_CLUSTER_BITS ||
>> + s->header.cluster_bits > MAX_CLUSTER_BITS) {
>> + ret = -EINVAL;
>> + goto fail;
>> + }
>> +
>> + s->cluster_size = 1 << s->header.cluster_bits;
>> + if (s->header.header_size != MAX(s->cluster_size, DEFAULT_HEADER_SIZE)) {
>> + char buf[64];
>> + snprintf(buf, sizeof(buf), "Header size: %d",
>
> %u or PRIu32 since header_size is uint32_t. This avoids compiler or
> code scanner warnings.
>
Okay.
>> + s->image_hd = bdrv_new("");
>> + ret = bdrv_open(s->image_hd, image_filename, NULL, flags,
>> + bdrv_find_format(s->header.image_fmt));
>
> Cannot use image_fmt as a string since it is not NUL-terminated.
>
Okay.
>> +static coroutine_fn int add_cow_co_writev(BlockDriverState *bs,
>> + int64_t sector_num,
>> + int remaining_sectors,
>> + QEMUIOVector *qiov)
>> +{
>> + BDRVAddCowState *s = bs->opaque;
>> + int ret = 0, i;
>> + QEMUIOVector hd_qiov;
>> + uint8_t *table;
>> + uint64_t offset;
>> + int mask = s->cluster_sectors - 1;
>> + int cluster_mask = s->cluster_size - 1;
>> +
>> + qemu_co_mutex_lock(&s->lock);
>> + qemu_iovec_init(&hd_qiov, qiov->niov);
>> + ret = bdrv_co_writev(s->image_hd, sector_num,
>> + remaining_sectors, qiov);
>
> All writes are serialized. This means write performance will be very
> poor for multi-threaded workloads.
>
> qcow2 tracks allocating writes and allows them to execute at the same
> time if they do not overlap clusters.
>
Okay, will refer qcow2 related code.
>> +
>> + if (ret < 0) {
>> + goto fail;
>> + }
>> + if ((s->header.compat_features & ACOW_F_ALL_ALLOCATED) == 0) {
>> + /* Copy content of unmodified sectors */
>> + if (!is_cluster_head(sector_num, s->cluster_sectors)
>> + && !is_allocated(bs, sector_num)) {
>> + ret = copy_sectors(bs, sector_num & ~mask, sector_num);
>> + if (ret < 0) {
>> + goto fail;
>> + }
>> + }
>> +
>> + if (!is_cluster_tail(sector_num + remaining_sectors - 1,
>> + s->cluster_sectors)
>> + && !is_allocated(bs, sector_num + remaining_sectors - 1)) {
>> + ret = copy_sectors(bs, sector_num + remaining_sectors,
>> + ((sector_num + remaining_sectors) | mask) + 1);
>> + if (ret < 0) {
>> + goto fail;
>> + }
>> + }
>
> This trashes the cluster when remaining_sectors = 0, sector_num =
> cluster_sectors, and sector cluster_sectors - 1 is unallocated.
>
> Probably best to return early when remaining_sectors == 0.
>
Okay.
>> +
>> + for (i = sector_num / s->cluster_sectors;
>> + i <= (sector_num + remaining_sectors - 1) / s->cluster_sectors;
>> + i++) {
>> + offset = s->header.header_size
>> + + (offset_in_bitmap(i * s->cluster_sectors,
>> + s->cluster_sectors) & (~cluster_mask));
>> + ret = block_cache_get(bs, s->bitmap_cache, offset, (void **)&table);
>> + if (ret < 0) {
>> + goto fail;
>> + }
>> + if ((table[i / 8] & (1 << (i % 8))) == 0) {
>> + table[i / 8] |= (1 << (i % 8));
>
> i is based on sector_num while table[] starts at offset, not sector 0.
> The index expression i / 8 leads to out-of-bounds accesses.
>
> I think you forgot to & (s->cluster_sectors - 1).
>
Okay.
>> +static coroutine_fn int add_cow_co_flush(BlockDriverState *bs)
>> +{
>> + BDRVAddCowState *s = bs->opaque;
>> + int ret;
>> +
>> + qemu_co_mutex_lock(&s->lock);
>> + if (s->bitmap_cache) {
>> + ret = block_cache_flush(bs, s->bitmap_cache);
>> + if (ret < 0) {
>> + return ret;
>> + }
>> + }
>> + ret = bdrv_flush(s->image_hd);
>
> This is the wrong way around. We must flush image_hd first so that
> valid data is on disk. Then we can flush bitmap_cache to mark the
> clusters allocated.
>
> Beyond explicit flushes you also need to make sure that image_hd is
> flushed *before* bitmap_cache tables are written out (e.g. cache
> eviction when the cache becomes full). It seems this code is missing.
>
> Also please use bdrv_co_flush() instead of bdrv_flush() in
> add_cow_co_flush() since this is a coroutine function.
>
Okay.
>> diff --git a/block/block-cache.c b/block/block-cache.c
>> index 3544691..4824632 100644
>> --- a/block/block-cache.c
>> +++ b/block/block-cache.c
>> @@ -130,7 +130,7 @@ static int block_cache_entry_flush(BlockDriverState *bs, BlockCache *c, int i)
>> } else if (c->table_type == BLOCK_TABLE_L2) {
>> BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE);
>> } else if (c->table_type == BLOCK_TABLE_BITMAP) {
>> - BLKDBG_EVENT(bs->file, BLKDBG_COW_WRITE);
>> + BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD);
>> }
>>
>> ret = bdrv_pwrite(bs->file, c->entries[i].offset,
>> @@ -265,7 +265,7 @@ static int block_cache_do_get(BlockDriverState *bs, BlockCache *c,
>> if (c->table_type == BLOCK_TABLE_L2) {
>> BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD);
>> } else if (c->table_type == BLOCK_TABLE_BITMAP) {
>> - BLKDBG_EVENT(bs->file, BLKDBG_COW_READ);
>> + BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD);
>> }
>>
>> ret = bdrv_pread(bs->file, offset, c->entries[i].table,
>
> I commented on this in the previous patch. Please squash this fix into
> the previous patch.
>
Okay.
>> diff --git a/block/qcow2.h b/block/qcow2.h
>> index e7f6aec..a4e514b 100644
>> --- a/block/qcow2.h
>> +++ b/block/qcow2.h
>> @@ -98,6 +98,9 @@ typedef struct QCowSnapshot {
>> uint64_t vm_clock_nsec;
>> } QCowSnapshot;
>>
>> +struct BlockCache;
>> +typedef struct BlockCache BlockCache;
>> +
>> typedef struct Qcow2UnknownHeaderExtension {
>> uint32_t magic;
>> uint32_t len;
>> @@ -389,7 +392,4 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs, const char *snapshot_name);
>> void qcow2_free_snapshots(BlockDriverState *bs);
>> int qcow2_read_snapshots(BlockDriverState *bs);
>>
>> -/* qcow2-cache.c functions */
>> -
>> -
>
> More qcow2-cache.c move cleanups? Please squash into the previous
> patch.
Okay.
>
>
next prev parent reply other threads:[~2013-04-23 1:54 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-04-10 8:11 [Qemu-devel] [PATCH V18 0/6] add-cow file format Dong Xu Wang
2013-04-10 8:11 ` [Qemu-devel] [PATCH V18 1/6] docs: document for " Dong Xu Wang
2013-04-18 8:30 ` Stefan Hajnoczi
2013-04-23 1:45 ` Dong Xu Wang
2013-04-26 22:45 ` Eric Blake
2013-05-02 5:44 ` Dong Xu Wang
2013-04-10 8:11 ` [Qemu-devel] [PATCH V18 2/6] make path_has_protocol non static Dong Xu Wang
2013-04-10 8:11 ` [Qemu-devel] [PATCH V18 3/6] qed_read_string to bdrv_read_string Dong Xu Wang
2013-04-10 8:11 ` [Qemu-devel] [PATCH V18 4/6] rename qcow2-cache.c to block-cache.c Dong Xu Wang
2013-04-18 8:54 ` Stefan Hajnoczi
2013-04-23 1:47 ` Dong Xu Wang
2013-04-10 8:11 ` [Qemu-devel] [PATCH V18 5/6] add-cow file format core code Dong Xu Wang
2013-04-18 10:03 ` Stefan Hajnoczi
2013-04-23 1:54 ` Dong Xu Wang [this message]
2013-05-09 6:24 ` Dong Xu Wang
2013-05-09 15:07 ` Stefan Hajnoczi
2013-05-13 15:07 ` Jeff Cody
2013-05-14 2:15 ` Dong Xu Wang
2013-04-10 8:11 ` [Qemu-devel] [PATCH V18 6/6] qemu-iotests: add add-cow iotests support Dong Xu Wang
2013-04-18 10:06 ` [Qemu-devel] [PATCH V18 0/6] add-cow file format Stefan Hajnoczi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5175E9D1.60401@linux.vnet.ibm.com \
--to=wdongxu@linux.vnet.ibm.com \
--cc=kwol@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@gmail.com \
--cc=stefanha@redhat.com \
--cc=wdongxu@cn.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.