From: John Snow <jsnow@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>,
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Cc: kwolf@redhat.com, pbonzini@redhat.com,
Vladimir Sementsov-Ogievskiy <vsementsov@parallels.com>,
qemu-devel@nongnu.org, den@openvz.org
Subject: Re: [Qemu-devel] [PATCH 2/8] qcow2: add dirty-bitmaps feature
Date: Fri, 12 Jun 2015 15:02:33 -0400 [thread overview]
Message-ID: <557B2CC9.5020309@redhat.com> (raw)
In-Reply-To: <20150610143041.GE2430@stefanha-thinkpad.home>
On 06/10/2015 10:30 AM, Stefan Hajnoczi wrote:
> On Mon, Jun 08, 2015 at 06:21:20PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>
> I noticed a corner case, it's probably not a problem in practice:
>
> Since the dirty bitmap is stored with the help of a BlockDriverState
> (and its bs->file), it's possible that writing the bitmap will cause
> bits in the bitmap to be dirtied!
>
But since it's metadata and not stored within a disk sector, can this
actually happen? Do you have an example of a scenario where this might
come up?
>> diff --git a/block/qcow2-dirty-bitmap.c b/block/qcow2-dirty-bitmap.c
>> new file mode 100644
>> index 0000000..bc0167c
>> --- /dev/null
>> +++ b/block/qcow2-dirty-bitmap.c
>> @@ -0,0 +1,503 @@
>> +/*
>> + * Dirty bitmpas for the QCOW version 2 format
>
> s/bitmpas/bitmaps/
>
>> +int qcow2_read_dirty_bitmaps(BlockDriverState *bs)
>> +{
>> + BDRVQcowState *s = bs->opaque;
>> + QCowDirtyBitmapHeader h;
>> + QCowDirtyBitmap *bm;
>> + int i, name_size;
>> + int64_t offset;
>> + int ret;
>> +
>> + if (!s->nb_dirty_bitmaps) {
>> + s->dirty_bitmaps = NULL;
>> + s->dirty_bitmaps_size = 0;
>> + return 0;
>> + }
>> +
>> + offset = s->dirty_bitmaps_offset;
>> + s->dirty_bitmaps = g_new0(QCowDirtyBitmap, s->nb_dirty_bitmaps);
>
> Please use g_try_new0() and handle the NULL return value.
> g_new/g_malloc abort the process if there is not enough memory. When
> opening untrusted image files it is possible that large values will be
> encountered and allocations fail. In that case .bdrv_open() should fail
> instead of killing QEMU.
>
> Using g_try_*() in QEMU is not an exact science but large data buffers
> or allocations where external inputs influence the size are good
> candidates.
>
> Other allocations in these patches should do that too.
>
>> + /* Allocate space for the new dirty bitmap table */
>> + dirty_bitmaps_offset = qcow2_alloc_clusters(bs, dirty_bitmaps_size);
>> + offset = dirty_bitmaps_offset;
>> + if (offset < 0) {
>> + ret = offset;
>> + goto fail;
>> + }
>> + ret = bdrv_flush(bs);
>
> Not sure there is a need for this. The clusters are inaccessible since
> no metadata points to them yet. Therefore we don't need to flush yet
> because there is no risk of seeing an inconsistent state.
>
>> + /* Write all dirty bitmaps to the new table */
>> + for (i = 0; i < s->nb_dirty_bitmaps; i++) {
>> + bm = s->dirty_bitmaps + i;
>> + memset(&h, 0, sizeof(h));
>> + h.l1_table_offset = cpu_to_be64(bm->l1_table_offset);
>> + h.l1_size = cpu_to_be32(bm->l1_size);
>> + h.bitmap_granularity = cpu_to_be32(bm->bitmap_granularity);
>> + h.bitmap_size = cpu_to_be64(bm->bitmap_size);
>> +
>> + name_size = strlen(bm->name);
>> + assert(name_size <= UINT16_MAX);
>> + h.name_size = cpu_to_be16(name_size);
>> + offset = align_offset(offset, 8);
>> +
>> + ret = bdrv_pwrite(bs->file, offset, &h, sizeof(h));
>> + if (ret < 0) {
>> + goto fail;
>> + }
>> + offset += sizeof(h);
>> +
>> + ret = bdrv_pwrite(bs->file, offset, bm->name, name_size);
>> + if (ret < 0) {
>> + goto fail;
>> + }
>> + offset += name_size;
>> + }
>
> If files have many thousands of bitmaps then this loop will be slow. It
> would be much faster to write out 1 cluster at a time. This probably
> doesn't matter in practice since this function doesn't get called much
> and normally files will have few bitmaps.
>
>> +
>> + /*
>> + * Update the header extension to point to the new dirty bitmap table. This
>> + * requires the new table and its refcounts to be stable on disk.
>> + */
>> + ret = bdrv_flush(bs);
>> + if (ret < 0) {
>> + goto fail;
>> + }
>> +
>> + s->dirty_bitmaps_offset = dirty_bitmaps_offset;
>> + s->dirty_bitmaps_size = dirty_bitmaps_size;
>> + ret = qcow2_update_header(bs);
>> + if (ret < 0) {
>> + fprintf(stderr, "Could not update qcow2 header\n");
>> + goto fail;
>> + }
>
> qcow2_update_header() does not flush. We need to flush before freeing
> the old clusters in order to guarantee that the file now points to the
> new clusters.
>
>> +uint8_t *qcow2_dirty_bitmap_load(BlockDriverState *bs,
>> + const char *name, uint64_t size,
>> + int granularity)
>> +{
>> + BDRVQcowState *s = bs->opaque;
>> + int i, dirty_bitmap_index, ret;
>> + uint64_t offset;
>> + QCowDirtyBitmap *bm;
>> + uint64_t *l1_table;
>> + uint8_t *buf;
>> +
>> + dirty_bitmap_index = find_dirty_bitmap_by_name(bs, name);
>> + if (dirty_bitmap_index < 0) {
>> + return NULL;
>> + }
>> + bm = &s->dirty_bitmaps[dirty_bitmap_index];
>> +
>> + if (size != bm->bitmap_size || granularity != bm->bitmap_granularity) {
>> + return NULL;
>> + }
>> +
>> + l1_table = g_malloc(bm->l1_size * sizeof(uint64_t));
>
> Please use g_try_malloc() with NULL handling.
>
>> + ret = bdrv_pread(bs->file, bm->l1_table_offset, l1_table,
>> + bm->l1_size * sizeof(uint64_t));
>> + if (ret < 0) {
>> + goto fail;
>> + }
>> +
>> + buf = g_malloc0(bm->l1_size * s->cluster_size);
>
> What is the maximum l1_size value? cluster_size and l1_size are 32-bit
> so with 64 KB cluster_size this overflows if l1_size > 65535. Do you
> want to cast to size_t?
>
>> + for (i = 0; i < bm->l1_size; ++i) {
>> + offset = be64_to_cpu(l1_table[i]);
>> + if (!(offset & 1)) {
>
> This doesn't honor the 0 offset means unallocated cluster behavior for
> the Standard Cluster Descriptor from the qcow2 specification.
>
>> + ret = bdrv_pread(bs->file, offset, buf + i * s->cluster_size,
>> + s->cluster_size);
>> + if (ret < 0) {
>> + goto fail;
>
> Missing g_free(buf)
>
>> + l1_table = g_try_new(uint64_t, bm->l1_size);
>> + if (l1_table == NULL) {
>> + ret = -ENOMEM;
>> + goto fail;
>> + }
>> +
>> + /* initialize with zero clusters */
>> + for (i = 0; i < s->l1_size; i++) {
>> + l1_table[i] = cpu_to_be64(1);
>> + }
>> +
>> + ret = qcow2_pre_write_overlap_check(bs, 0, bm->l1_table_offset,
>> + s->l1_size * sizeof(uint64_t));
>> + if (ret < 0) {
>> + goto fail;
>> + }
>> +
>> + ret = bdrv_pwrite(bs->file, bm->l1_table_offset, l1_table,
>> + s->l1_size * sizeof(uint64_t));
>> + if (ret < 0) {
>> + goto fail;
>> + }
>
> Flush is needed here to ensure the bitmap has reached disk before the
> dirty_bitmaps array is written out.
>
>> +
>> + g_free(l1_table);
>> + l1_table = NULL;
>> +
>> + /* Append the new dirty bitmap to the dirty bitmap list */
>> + new_dirty_bitmap_list = g_new(QCowDirtyBitmap, s->nb_dirty_bitmaps + 1);
>> + if (s->dirty_bitmaps) {
>> + memcpy(new_dirty_bitmap_list, s->dirty_bitmaps,
>> + s->nb_dirty_bitmaps * sizeof(QCowDirtyBitmap));
>> + old_dirty_bitmap_list = s->dirty_bitmaps;
>> + }
>> + s->dirty_bitmaps = new_dirty_bitmap_list;
>> + s->dirty_bitmaps[s->nb_dirty_bitmaps++] = *bm;
>> +
>> + ret = qcow2_write_dirty_bitmaps(bs);
>> + if (ret < 0) {
>> + g_free(s->dirty_bitmaps);
>> + s->dirty_bitmaps = old_dirty_bitmap_list;
>> + s->nb_dirty_bitmaps--;
>> + goto fail;
>> + }
>> +
>> + g_free(old_dirty_bitmap_list);
>> +
>> + return 0;
>> +
>> +fail:
>> + g_free(bm->name);
>> + g_free(l1_table);
>> +
>
> The l1_table clusters should be freed on failure.
>
next prev parent reply other threads:[~2015-06-12 19:02 UTC|newest]
Thread overview: 76+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-06-08 15:21 [Qemu-devel] [PATCH v2 RFC 0/8] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
2015-06-08 15:21 ` [Qemu-devel] [PATCH 1/8] spec: add qcow2-dirty-bitmaps specification Vladimir Sementsov-Ogievskiy
2015-06-09 16:01 ` John Snow
2015-06-09 17:03 ` Stefan Hajnoczi
2015-06-10 8:19 ` Vladimir Sementsov-Ogievskiy
2015-06-10 8:49 ` Vladimir Sementsov-Ogievskiy
2015-06-10 13:00 ` Eric Blake
2015-06-11 10:16 ` Vladimir Sementsov-Ogievskiy
2015-06-10 13:24 ` Stefan Hajnoczi
2015-06-11 10:19 ` Vladimir Sementsov-Ogievskiy
2015-06-11 13:03 ` Stefan Hajnoczi
2015-06-11 16:21 ` John Snow
2015-06-12 10:28 ` Stefan Hajnoczi
2015-06-12 15:19 ` John Snow
2015-06-10 15:34 ` Kevin Wolf
2015-06-11 10:25 ` Vladimir Sementsov-Ogievskiy
2015-06-11 16:30 ` John Snow
2015-06-12 8:33 ` Kevin Wolf
2015-08-24 10:46 ` Vladimir Sementsov-Ogievskiy
2015-08-24 13:30 ` Vladimir Sementsov-Ogievskiy
2015-08-24 14:08 ` Vladimir Sementsov-Ogievskiy
2015-08-24 14:04 ` Vladimir Sementsov-Ogievskiy
2015-08-31 22:21 ` Eric Blake
2015-08-31 22:24 ` John Snow
2015-06-08 15:21 ` [Qemu-devel] [PATCH 2/8] qcow2: add dirty-bitmaps feature Vladimir Sementsov-Ogievskiy
2015-06-09 16:52 ` Stefan Hajnoczi
2015-06-10 14:30 ` Stefan Hajnoczi
2015-06-12 19:02 ` John Snow [this message]
2015-06-15 14:42 ` Stefan Hajnoczi
2015-06-23 17:57 ` John Snow
2015-06-24 9:39 ` Stefan Hajnoczi
2015-08-14 17:14 ` Vladimir Sementsov-Ogievskiy
2015-08-26 9:09 ` Stefan Hajnoczi
2015-06-11 23:04 ` John Snow
2015-06-15 14:05 ` Vladimir Sementsov-Ogievskiy
2015-06-15 16:53 ` John Snow
2015-06-12 21:55 ` John Snow
2015-08-26 13:15 ` Vladimir Sementsov-Ogievskiy
2015-08-26 14:14 ` Vladimir Sementsov-Ogievskiy
2015-08-27 12:43 ` Vladimir Sementsov-Ogievskiy
2015-06-08 15:21 ` [Qemu-devel] [PATCH 3/8] block: store persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
2015-06-08 15:21 ` [Qemu-devel] [PATCH 4/8] block: add bdrv_load_dirty_bitmap Vladimir Sementsov-Ogievskiy
2015-06-09 16:01 ` Stefan Hajnoczi
2015-06-10 22:33 ` John Snow
2015-06-11 10:41 ` Vladimir Sementsov-Ogievskiy
2015-06-08 15:21 ` [Qemu-devel] [PATCH 5/8] qcow2: add qcow2_dirty_bitmap_delete_all Vladimir Sementsov-Ogievskiy
2015-06-08 15:21 ` [Qemu-devel] [PATCH 6/8] qcow2: add autoclear bit for dirty bitmaps Vladimir Sementsov-Ogievskiy
2015-06-09 15:49 ` Stefan Hajnoczi
2015-06-09 15:50 ` Stefan Hajnoczi
2015-08-27 7:45 ` Vladimir Sementsov-Ogievskiy
2015-08-31 11:06 ` Vladimir Sementsov-Ogievskiy
2015-08-31 22:39 ` Eric Blake
2015-08-31 22:50 ` Eric Blake
2015-06-10 23:42 ` John Snow
2015-06-11 8:35 ` Kevin Wolf
2015-06-11 10:49 ` Vladimir Sementsov-Ogievskiy
2015-06-11 16:36 ` John Snow
2015-06-08 15:21 ` [Qemu-devel] [PATCH 7/8] qemu: command line option " Vladimir Sementsov-Ogievskiy
2015-06-11 20:57 ` John Snow
2015-06-12 21:49 ` John Snow
2015-06-08 15:21 ` [Qemu-devel] [PATCH 8/8] iotests: test internal persistent dirty bitmap Vladimir Sementsov-Ogievskiy
2015-06-09 16:17 ` Eric Blake
2015-06-10 15:27 ` [Qemu-devel] [PATCH v2 RFC 0/8] block: persistent dirty bitmaps Stefan Hajnoczi
2015-06-11 11:22 ` Vladimir Sementsov-Ogievskiy
2015-06-11 13:14 ` Stefan Hajnoczi
2015-06-11 20:06 ` Stefan Hajnoczi
2015-06-12 9:58 ` Denis V. Lunev
2015-06-12 10:36 ` Stefan Hajnoczi
2015-08-26 6:26 ` Vladimir Sementsov-Ogievskiy
2015-08-26 9:13 ` Stefan Hajnoczi
2015-06-12 19:34 ` John Snow
2015-06-17 14:29 ` Vladimir Sementsov-Ogievskiy
2015-06-24 0:21 ` John Snow
2015-07-08 12:24 ` Vladimir Sementsov-Ogievskiy
2015-07-08 15:21 ` John Snow
2015-08-27 10:08 ` Vladimir Sementsov-Ogievskiy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=557B2CC9.5020309@redhat.com \
--to=jsnow@redhat.com \
--cc=den@openvz.org \
--cc=kwolf@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
--cc=vsementsov@parallels.com \
--cc=vsementsov@virtuozzo.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.