From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from [140.186.70.92] (port=41302 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1PhNn5-0005nl-HI
	for Qemu-devel@nongnu.org; Mon, 24 Jan 2011 09:53:13 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <kwolf@redhat.com>) id 1PhNn3-000703-Q0
	for Qemu-devel@nongnu.org; Mon, 24 Jan 2011 09:53:11 -0500
Received: from mx1.redhat.com ([209.132.183.28]:48407)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <kwolf@redhat.com>) id 1PhNn3-0006zT-Hr
	for Qemu-devel@nongnu.org; Mon, 24 Jan 2011 09:53:09 -0500
Message-ID: <4D3D92B0.3060201@redhat.com>
Date: Mon, 24 Jan 2011 15:54:40 +0100
From: Kevin Wolf <kwolf@redhat.com>
MIME-Version: 1.0
Subject: Re: [Qemu-devel] [PATCH v4 2/3] qcow2: Use QcowCache
References: <1295543412-24012-1-git-send-email-kwolf@redhat.com>	<1295543412-24012-3-git-send-email-kwolf@redhat.com>
	<AANLkTimeRQVVeNnRbaiRNQ-SbExyJDZiuteBP7wfv9He@mail.gmail.com>
In-Reply-To: <AANLkTimeRQVVeNnRbaiRNQ-SbExyJDZiuteBP7wfv9He@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: Qemu-devel@nongnu.org

[ Re-adding qemu-devel to CC ]

Am 24.01.2011 15:34, schrieb Stefan Hajnoczi:
> On Thu, Jan 20, 2011 at 5:10 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>> @@ -702,17 +622,30 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
>>
>>     if (m->nb_available & (s->cluster_sectors - 1)) {
>>         uint64_t end = m->nb_available & ~(uint64_t)(s->cluster_sectors - 1);
>> +        cow = true;
>>         ret = copy_sectors(bs, start_sect + end, cluster_offset + (end << 9),
>>                 m->nb_available - end, s->cluster_sectors);
>>         if (ret < 0)
>>             goto err;
>>     }
>>
>> -    /* update L2 table */
>> +    /*
>> +     * Update L2 table.
>> +     *
>> +     * Before we update the L2 table to actually point to the new cluster, we
>> +     * need to be sure that the refcounts have been increased and COW was
>> +     * handled.
>> +     */
>> +    if (cow) {
>> +        bdrv_flush(bs->file);
> 
> Just bdrv_flush(bs->file) and not a refcounts cache flush?

Refcounts and data need not to be ordered against each other. They only
must both be on disk when we write the L2 table.

>> +    }
>> +
>> +    qcow2_cache_set_dependency(bs, s->l2_table_cache, s->refcount_block_cache);
>>     ret = get_cluster_table(bs, m->offset, &l2_table, &l2_offset, &l2_index);
>>     if (ret < 0) {
>>         goto err;
>>     }
>> +    qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
>>
>>     for (i = 0; i < m->nb_clusters; i++) {
>>         /* if two concurrent writes happen to the same unallocated cluster
>> @@ -728,16 +661,9 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
>>                     (i << s->cluster_bits)) | QCOW_OFLAG_COPIED);
>>      }
>>
>> -    /*
>> -     * Before we update the L2 table to actually point to the new cluster, we
>> -     * need to be sure that the refcounts have been increased and COW was
>> -     * handled.
>> -     */
>> -    bdrv_flush(bs->file);
>>
>> -    ret = write_l2_entries(bs, l2_table, l2_offset, l2_index, m->nb_clusters);
>> +    ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
>>     if (ret < 0) {
>> -        qcow2_l2_cache_reset(bs);
>>         goto err;
>>     }
>>
> 
> The function continues like this:
> 
> /*
>  * If this was a COW, we need to decrease the refcount of the old cluster.
>  * Also flush bs->file to get the right order for L2 and refcount update.
>  */
> if (j != 0) {
>     bdrv_flush(bs->file);
>     for (i = 0; i < j; i++) {
>         qcow2_free_any_clusters(bs,
>             be64_to_cpu(old_cluster[i]) & ~QCOW_OFLAG_COPIED, 1);
>     }
> }
> 
> Does bdrv_flush(bs->file) "get the right order for L2 and refcount
> update"?  We've just put an L2 table, should this be an L2 table
> flush?

I agree, this looks wrong. What we need is a dependency to ensure that
first L2 is flushed and then the refcount block.
qcow2_free_any_clusters() calls update_refcount() indirectly, which
takes care of setting this dependency.

So in the end I think it's just an unnecessary bdrv_flush. Makes sense?

Kevin