qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: John Snow <jsnow@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>,
	Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Cc: kwolf@redhat.com, pbonzini@redhat.com,
	Vladimir Sementsov-Ogievskiy <vsementsov@parallels.com>,
	qemu-devel@nongnu.org, den@openvz.org
Subject: Re: [Qemu-devel] [PATCH 2/8] qcow2: add dirty-bitmaps feature
Date: Fri, 12 Jun 2015 15:02:33 -0400	[thread overview]
Message-ID: <557B2CC9.5020309@redhat.com> (raw)
In-Reply-To: <20150610143041.GE2430@stefanha-thinkpad.home>



On 06/10/2015 10:30 AM, Stefan Hajnoczi wrote:
> On Mon, Jun 08, 2015 at 06:21:20PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> 
> I noticed a corner case, it's probably not a problem in practice:
> 
> Since the dirty bitmap is stored with the help of a BlockDriverState
> (and its bs->file), it's possible that writing the bitmap will cause
> bits in the bitmap to be dirtied!
> 

But since it's metadata and not stored within a disk sector, can this
actually happen? Do you have an example of a scenario where this might
come up?

>> diff --git a/block/qcow2-dirty-bitmap.c b/block/qcow2-dirty-bitmap.c
>> new file mode 100644
>> index 0000000..bc0167c
>> --- /dev/null
>> +++ b/block/qcow2-dirty-bitmap.c
>> @@ -0,0 +1,503 @@
>> +/*
>> + * Dirty bitmpas for the QCOW version 2 format
> 
> s/bitmpas/bitmaps/
> 
>> +int qcow2_read_dirty_bitmaps(BlockDriverState *bs)
>> +{
>> +    BDRVQcowState *s = bs->opaque;
>> +    QCowDirtyBitmapHeader h;
>> +    QCowDirtyBitmap *bm;
>> +    int i, name_size;
>> +    int64_t offset;
>> +    int ret;
>> +
>> +    if (!s->nb_dirty_bitmaps) {
>> +        s->dirty_bitmaps = NULL;
>> +        s->dirty_bitmaps_size = 0;
>> +        return 0;
>> +    }
>> +
>> +    offset = s->dirty_bitmaps_offset;
>> +    s->dirty_bitmaps = g_new0(QCowDirtyBitmap, s->nb_dirty_bitmaps);
> 
> Please use g_try_new0() and handle the NULL return value.
> g_new/g_malloc abort the process if there is not enough memory.  When
> opening untrusted image files it is possible that large values will be
> encountered and allocations fail.  In that case .bdrv_open() should fail
> instead of killing QEMU.
> 
> Using g_try_*() in QEMU is not an exact science but large data buffers
> or allocations where external inputs influence the size are good
> candidates.
> 
> Other allocations in these patches should do that too.
> 
>> +    /* Allocate space for the new dirty bitmap table */
>> +    dirty_bitmaps_offset = qcow2_alloc_clusters(bs, dirty_bitmaps_size);
>> +    offset = dirty_bitmaps_offset;
>> +    if (offset < 0) {
>> +        ret = offset;
>> +        goto fail;
>> +    }
>> +    ret = bdrv_flush(bs);
> 
> Not sure there is a need for this.  The clusters are inaccessible since
> no metadata points to them yet.  Therefore we don't need to flush yet
> because there is no risk of seeing an inconsistent state.
> 
>> +    /* Write all dirty bitmaps to the new table */
>> +    for (i = 0; i < s->nb_dirty_bitmaps; i++) {
>> +        bm = s->dirty_bitmaps + i;
>> +        memset(&h, 0, sizeof(h));
>> +        h.l1_table_offset = cpu_to_be64(bm->l1_table_offset);
>> +        h.l1_size = cpu_to_be32(bm->l1_size);
>> +        h.bitmap_granularity = cpu_to_be32(bm->bitmap_granularity);
>> +        h.bitmap_size = cpu_to_be64(bm->bitmap_size);
>> +
>> +        name_size = strlen(bm->name);
>> +        assert(name_size <= UINT16_MAX);
>> +        h.name_size = cpu_to_be16(name_size);
>> +        offset = align_offset(offset, 8);
>> +
>> +        ret = bdrv_pwrite(bs->file, offset, &h, sizeof(h));
>> +        if (ret < 0) {
>> +            goto fail;
>> +        }
>> +        offset += sizeof(h);
>> +
>> +        ret = bdrv_pwrite(bs->file, offset, bm->name, name_size);
>> +        if (ret < 0) {
>> +            goto fail;
>> +        }
>> +        offset += name_size;
>> +    }
> 
> If files have many thousands of bitmaps then this loop will be slow.  It
> would be much faster to write out 1 cluster at a time.  This probably
> doesn't matter in practice since this function doesn't get called much
> and normally files will have few bitmaps.
> 
>> +
>> +    /*
>> +     * Update the header extension to point to the new dirty bitmap table. This
>> +     * requires the new table and its refcounts to be stable on disk.
>> +     */
>> +    ret = bdrv_flush(bs);
>> +    if (ret < 0) {
>> +        goto fail;
>> +    }
>> +
>> +    s->dirty_bitmaps_offset = dirty_bitmaps_offset;
>> +    s->dirty_bitmaps_size = dirty_bitmaps_size;
>> +    ret = qcow2_update_header(bs);
>> +    if (ret < 0) {
>> +        fprintf(stderr, "Could not update qcow2 header\n");
>> +        goto fail;
>> +    }
> 
> qcow2_update_header() does not flush.  We need to flush before freeing
> the old clusters in order to guarantee that the file now points to the
> new clusters.
> 
>> +uint8_t *qcow2_dirty_bitmap_load(BlockDriverState *bs,
>> +                            const char *name, uint64_t size,
>> +                            int granularity)
>> +{
>> +    BDRVQcowState *s = bs->opaque;
>> +    int i, dirty_bitmap_index, ret;
>> +    uint64_t offset;
>> +    QCowDirtyBitmap *bm;
>> +    uint64_t *l1_table;
>> +    uint8_t *buf;
>> +
>> +    dirty_bitmap_index = find_dirty_bitmap_by_name(bs, name);
>> +    if (dirty_bitmap_index < 0) {
>> +        return NULL;
>> +    }
>> +    bm = &s->dirty_bitmaps[dirty_bitmap_index];
>> +
>> +    if (size != bm->bitmap_size || granularity != bm->bitmap_granularity) {
>> +        return NULL;
>> +    }
>> +
>> +    l1_table = g_malloc(bm->l1_size * sizeof(uint64_t));
> 
> Please use g_try_malloc() with NULL handling.
> 
>> +    ret = bdrv_pread(bs->file, bm->l1_table_offset, l1_table,
>> +                     bm->l1_size * sizeof(uint64_t));
>> +    if (ret < 0) {
>> +        goto fail;
>> +    }
>> +
>> +    buf = g_malloc0(bm->l1_size * s->cluster_size);
> 
> What is the maximum l1_size value?  cluster_size and l1_size are 32-bit
> so with 64 KB cluster_size this overflows if l1_size > 65535.  Do you
> want to cast to size_t?
> 
>> +    for (i = 0; i < bm->l1_size; ++i) {
>> +        offset = be64_to_cpu(l1_table[i]);
>> +        if (!(offset & 1)) {
> 
> This doesn't honor the 0 offset means unallocated cluster behavior for
> the Standard Cluster Descriptor from the qcow2 specification.
> 
>> +            ret = bdrv_pread(bs->file, offset, buf + i * s->cluster_size,
>> +                             s->cluster_size);
>> +            if (ret < 0) {
>> +                goto fail;
> 
> Missing g_free(buf)
> 
>> +    l1_table = g_try_new(uint64_t, bm->l1_size);
>> +    if (l1_table == NULL) {
>> +        ret = -ENOMEM;
>> +        goto fail;
>> +    }
>> +
>> +    /* initialize with zero clusters */
>> +    for (i = 0; i < s->l1_size; i++) {
>> +        l1_table[i] = cpu_to_be64(1);
>> +    }
>> +
>> +    ret = qcow2_pre_write_overlap_check(bs, 0, bm->l1_table_offset,
>> +                                        s->l1_size * sizeof(uint64_t));
>> +    if (ret < 0) {
>> +        goto fail;
>> +    }
>> +
>> +    ret = bdrv_pwrite(bs->file, bm->l1_table_offset, l1_table,
>> +                      s->l1_size * sizeof(uint64_t));
>> +    if (ret < 0) {
>> +        goto fail;
>> +    }
> 
> Flush is needed here to ensure the bitmap has reached disk before the
> dirty_bitmaps array is written out.
> 
>> +
>> +    g_free(l1_table);
>> +    l1_table = NULL;
>> +
>> +    /* Append the new dirty bitmap to the dirty bitmap list */
>> +    new_dirty_bitmap_list = g_new(QCowDirtyBitmap, s->nb_dirty_bitmaps + 1);
>> +    if (s->dirty_bitmaps) {
>> +        memcpy(new_dirty_bitmap_list, s->dirty_bitmaps,
>> +               s->nb_dirty_bitmaps * sizeof(QCowDirtyBitmap));
>> +        old_dirty_bitmap_list = s->dirty_bitmaps;
>> +    }
>> +    s->dirty_bitmaps = new_dirty_bitmap_list;
>> +    s->dirty_bitmaps[s->nb_dirty_bitmaps++] = *bm;
>> +
>> +    ret = qcow2_write_dirty_bitmaps(bs);
>> +    if (ret < 0) {
>> +        g_free(s->dirty_bitmaps);
>> +        s->dirty_bitmaps = old_dirty_bitmap_list;
>> +        s->nb_dirty_bitmaps--;
>> +        goto fail;
>> +    }
>> +
>> +    g_free(old_dirty_bitmap_list);
>> +
>> +    return 0;
>> +
>> +fail:
>> +    g_free(bm->name);
>> +    g_free(l1_table);
>> +
> 
> The l1_table clusters should be freed on failure.
> 

  reply	other threads:[~2015-06-12 19:02 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-08 15:21 [Qemu-devel] [PATCH v2 RFC 0/8] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
2015-06-08 15:21 ` [Qemu-devel] [PATCH 1/8] spec: add qcow2-dirty-bitmaps specification Vladimir Sementsov-Ogievskiy
2015-06-09 16:01   ` John Snow
2015-06-09 17:03   ` Stefan Hajnoczi
2015-06-10  8:19     ` Vladimir Sementsov-Ogievskiy
2015-06-10  8:49       ` Vladimir Sementsov-Ogievskiy
2015-06-10 13:00       ` Eric Blake
2015-06-11 10:16         ` Vladimir Sementsov-Ogievskiy
2015-06-10 13:24       ` Stefan Hajnoczi
2015-06-11 10:19         ` Vladimir Sementsov-Ogievskiy
2015-06-11 13:03           ` Stefan Hajnoczi
2015-06-11 16:21             ` John Snow
2015-06-12 10:28               ` Stefan Hajnoczi
2015-06-12 15:19                 ` John Snow
2015-06-10 15:34   ` Kevin Wolf
2015-06-11 10:25     ` Vladimir Sementsov-Ogievskiy
2015-06-11 16:30       ` John Snow
2015-06-12  8:33         ` Kevin Wolf
2015-08-24 10:46     ` Vladimir Sementsov-Ogievskiy
2015-08-24 13:30   ` Vladimir Sementsov-Ogievskiy
2015-08-24 14:08     ` Vladimir Sementsov-Ogievskiy
2015-08-24 14:04   ` Vladimir Sementsov-Ogievskiy
2015-08-31 22:21   ` Eric Blake
2015-08-31 22:24     ` John Snow
2015-06-08 15:21 ` [Qemu-devel] [PATCH 2/8] qcow2: add dirty-bitmaps feature Vladimir Sementsov-Ogievskiy
2015-06-09 16:52   ` Stefan Hajnoczi
2015-06-10 14:30   ` Stefan Hajnoczi
2015-06-12 19:02     ` John Snow [this message]
2015-06-15 14:42       ` Stefan Hajnoczi
2015-06-23 17:57         ` John Snow
2015-06-24  9:39           ` Stefan Hajnoczi
2015-08-14 17:14     ` Vladimir Sementsov-Ogievskiy
2015-08-26  9:09       ` Stefan Hajnoczi
2015-06-11 23:04   ` John Snow
2015-06-15 14:05     ` Vladimir Sementsov-Ogievskiy
2015-06-15 16:53       ` John Snow
2015-06-12 21:55   ` John Snow
2015-08-26 13:15     ` Vladimir Sementsov-Ogievskiy
2015-08-26 14:14       ` Vladimir Sementsov-Ogievskiy
2015-08-27 12:43   ` Vladimir Sementsov-Ogievskiy
2015-06-08 15:21 ` [Qemu-devel] [PATCH 3/8] block: store persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
2015-06-08 15:21 ` [Qemu-devel] [PATCH 4/8] block: add bdrv_load_dirty_bitmap Vladimir Sementsov-Ogievskiy
2015-06-09 16:01   ` Stefan Hajnoczi
2015-06-10 22:33     ` John Snow
2015-06-11 10:41       ` Vladimir Sementsov-Ogievskiy
2015-06-08 15:21 ` [Qemu-devel] [PATCH 5/8] qcow2: add qcow2_dirty_bitmap_delete_all Vladimir Sementsov-Ogievskiy
2015-06-08 15:21 ` [Qemu-devel] [PATCH 6/8] qcow2: add autoclear bit for dirty bitmaps Vladimir Sementsov-Ogievskiy
2015-06-09 15:49   ` Stefan Hajnoczi
2015-06-09 15:50   ` Stefan Hajnoczi
2015-08-27  7:45     ` Vladimir Sementsov-Ogievskiy
2015-08-31 11:06       ` Vladimir Sementsov-Ogievskiy
2015-08-31 22:39       ` Eric Blake
2015-08-31 22:50         ` Eric Blake
2015-06-10 23:42   ` John Snow
2015-06-11  8:35     ` Kevin Wolf
2015-06-11 10:49     ` Vladimir Sementsov-Ogievskiy
2015-06-11 16:36       ` John Snow
2015-06-08 15:21 ` [Qemu-devel] [PATCH 7/8] qemu: command line option " Vladimir Sementsov-Ogievskiy
2015-06-11 20:57   ` John Snow
2015-06-12 21:49   ` John Snow
2015-06-08 15:21 ` [Qemu-devel] [PATCH 8/8] iotests: test internal persistent dirty bitmap Vladimir Sementsov-Ogievskiy
2015-06-09 16:17   ` Eric Blake
2015-06-10 15:27 ` [Qemu-devel] [PATCH v2 RFC 0/8] block: persistent dirty bitmaps Stefan Hajnoczi
2015-06-11 11:22   ` Vladimir Sementsov-Ogievskiy
2015-06-11 13:14     ` Stefan Hajnoczi
2015-06-11 20:06 ` Stefan Hajnoczi
2015-06-12  9:58   ` Denis V. Lunev
2015-06-12 10:36     ` Stefan Hajnoczi
2015-08-26  6:26       ` Vladimir Sementsov-Ogievskiy
2015-08-26  9:13         ` Stefan Hajnoczi
2015-06-12 19:34 ` John Snow
2015-06-17 14:29   ` Vladimir Sementsov-Ogievskiy
2015-06-24  0:21     ` John Snow
2015-07-08 12:24       ` Vladimir Sementsov-Ogievskiy
2015-07-08 15:21         ` John Snow
2015-08-27 10:08       ` Vladimir Sementsov-Ogievskiy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=557B2CC9.5020309@redhat.com \
    --to=jsnow@redhat.com \
    --cc=den@openvz.org \
    --cc=kwolf@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=vsementsov@parallels.com \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).