From: Kevin Wolf <kwolf@redhat.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: Qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v4 2/3] qcow2: Use QcowCache
Date: Mon, 24 Jan 2011 16:36:50 +0100
Message-ID: <4D3D9C92.2040609@redhat.com>
In-Reply-To: <AANLkTi=cSZwrdb814MceqkkULoAd0yyh=TRQK_nobBjw@mail.gmail.com>

On 24.01.2011 16:26, Stefan Hajnoczi wrote:
> On Mon, Jan 24, 2011 at 2:54 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>> [ Re-adding qemu-devel to CC ]
>>
>> On 24.01.2011 15:34, Stefan Hajnoczi wrote:
>>> On Thu, Jan 20, 2011 at 5:10 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>>>> @@ -702,17 +622,30 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
>>>>
>>>>     if (m->nb_available & (s->cluster_sectors - 1)) {
>>>>         uint64_t end = m->nb_available & ~(uint64_t)(s->cluster_sectors - 1);
>>>> +        cow = true;
>>>>         ret = copy_sectors(bs, start_sect + end, cluster_offset + (end << 9),
>>>>                 m->nb_available - end, s->cluster_sectors);
>>>>         if (ret < 0)
>>>>             goto err;
>>>>     }
>>>>
>>>> -    /* update L2 table */
>>>> +    /*
>>>> +     * Update L2 table.
>>>> +     *
>>>> +     * Before we update the L2 table to actually point to the new cluster, we
>>>> +     * need to be sure that the refcounts have been increased and COW was
>>>> +     * handled.
>>>> +     */
>>>> +    if (cow) {
>>>> +        bdrv_flush(bs->file);
>>>
>>> Just bdrv_flush(bs->file) and not a refcounts cache flush?
>>
>> Refcounts and data need not be ordered against each other. They only
>> need to both be on disk before we write the L2 table.
> 
> Have I missed where refcounts actually get flushed from the cache out
> to the disk?  bdrv_flush(bs->file) only syncs the file but doesn't
> write out dirty data held in the cache.

The bdrv_flush() isn't supposed to flush the refcounts, only the data
copied during COW (what you call pre/postfill in QED).

The refcounts are handled by the qcow2_cache_set_dependency below, so
that before writing the L2 tables we always write the refcounts first.
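
To illustrate the ordering, here is a minimal toy model of such a cache
dependency. It is only a sketch and not the code from the patch: ToyCache,
toy_cache_flush() and toy_cache_set_dependency() are made-up names, and
"writing" a cache just appends its name to a log so that the order can be
inspected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a metadata cache; NOT the real qcow2 cache structure. */
typedef struct ToyCache {
    const char *name;
    bool dirty;                 /* true if the cache holds unwritten entries */
    struct ToyCache *depends;   /* cache that must be written first, or NULL */
} ToyCache;

/* Records the order in which caches were written out. */
#define TOY_LOG_MAX 8
static const char *toy_write_log[TOY_LOG_MAX];
static size_t toy_write_count;

static void toy_cache_flush(ToyCache *c);

/* Flush the cache that c depends on (if any) and drop the dependency. */
static void toy_flush_dependency(ToyCache *c)
{
    if (c->depends) {
        ToyCache *d = c->depends;
        c->depends = NULL;      /* cleared first so a cycle cannot recurse forever */
        toy_cache_flush(d);
    }
}

/* Write out a cache, after writing whatever it depends on. */
static void toy_cache_flush(ToyCache *c)
{
    toy_flush_dependency(c);
    if (c->dirty) {
        assert(toy_write_count < TOY_LOG_MAX);
        toy_write_log[toy_write_count++] = c->name;
        c->dirty = false;
    }
}

/* Require that 'dependency' is written out before 'c' is. */
static void toy_cache_set_dependency(ToyCache *c, ToyCache *dependency)
{
    /* Keep the model to one level: resolve older dependencies right away. */
    if (dependency->depends) {
        toy_flush_dependency(dependency);
    }
    if (c->depends && c->depends != dependency) {
        toy_flush_dependency(c);
    }
    c->depends = dependency;
}
```

With both toy caches dirty, calling toy_cache_set_dependency(&l2, &refcount)
and then toy_cache_flush(&l2) logs the refcount cache before the L2 cache.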

>>>> +    }
>>>> +
>>>> +    qcow2_cache_set_dependency(bs, s->l2_table_cache, s->refcount_block_cache);
>>>>     ret = get_cluster_table(bs, m->offset, &l2_table, &l2_offset, &l2_index);
>>>>     if (ret < 0) {
>>>>         goto err;
>>>>     }
>>>> +    qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
>>>>
>>>>     for (i = 0; i < m->nb_clusters; i++) {
>>>>         /* if two concurrent writes happen to the same unallocated cluster
>>>> @@ -728,16 +661,9 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
>>>>                     (i << s->cluster_bits)) | QCOW_OFLAG_COPIED);
>>>>      }
>>>>
>>>> -    /*
>>>> -     * Before we update the L2 table to actually point to the new cluster, we
>>>> -     * need to be sure that the refcounts have been increased and COW was
>>>> -     * handled.
>>>> -     */
>>>> -    bdrv_flush(bs->file);
>>>>
>>>> -    ret = write_l2_entries(bs, l2_table, l2_offset, l2_index, m->nb_clusters);
>>>> +    ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
>>>>     if (ret < 0) {
>>>> -        qcow2_l2_cache_reset(bs);
>>>>         goto err;
>>>>     }
>>>>
>>>
>>> The function continues like this:
>>>
>>> /*
>>>  * If this was a COW, we need to decrease the refcount of the old cluster.
>>>  * Also flush bs->file to get the right order for L2 and refcount update.
>>>  */
>>> if (j != 0) {
>>>     bdrv_flush(bs->file);
>>>     for (i = 0; i < j; i++) {
>>>         qcow2_free_any_clusters(bs,
>>>             be64_to_cpu(old_cluster[i]) & ~QCOW_OFLAG_COPIED, 1);
>>>     }
>>> }
>>>
>>> Does bdrv_flush(bs->file) "get the right order for L2 and refcount
>>> update"?  We've just put an L2 table, should this be an L2 table
>>> flush?
>>
>> I agree, this looks wrong. What we need is a dependency to ensure that
>> first L2 is flushed and then the refcount block.
>> qcow2_free_any_clusters() calls update_refcount() indirectly, which
>> takes care of setting this dependency.
>>
>> So in the end I think it's just an unnecessary bdrv_flush. Makes sense?
> 
> I don't understand this fully.  I've noticed that the cache isn't the
> only mechanism for making changes to tables - there are functions like
> write_l2_entries() that directly write out parts of tables without
> honoring dependencies or using the dirty bit on the cache entry.  I
> probably need to look at this more carefully to fully understand
> whether or not it is correct.

No, the cache should be the only mechanism that is used for accessing L2
tables and refcount blocks. write_l2_entries() is an old function that
is removed by the patch.

Direct accesses should remain only for L1 tables and refcount tables.
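
As a rough sketch of what "going through the cache" means for a caller:
get the table, modify it in memory, mark it dirty, put it back, and let a
later flush do the actual write. The names below (ToyTableCache,
toy_table_get() and so on) are hypothetical and model a single table only;
they are not the functions from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_TABLE_ENTRIES 4

/* Toy model of one cached metadata table; NOT the real qcow2 structures. */
typedef struct ToyTableCache {
    uint64_t disk[TOY_TABLE_ENTRIES];   /* what is "on disk" */
    uint64_t table[TOY_TABLE_ENTRIES];  /* in-memory copy handed to callers */
    bool loaded;
    bool dirty;
    int refs;                           /* outstanding get() references */
} ToyTableCache;

/* Hand out the in-memory table, loading it from "disk" on first use. */
static uint64_t *toy_table_get(ToyTableCache *c)
{
    if (!c->loaded) {
        memcpy(c->table, c->disk, sizeof(c->table));
        c->loaded = true;
    }
    c->refs++;
    return c->table;
}

/* Remember that the in-memory copy differs from what is on "disk". */
static void toy_table_mark_dirty(ToyTableCache *c)
{
    c->dirty = true;
}

/* Release a reference and clear the caller's pointer; writes nothing. */
static void toy_table_put(ToyTableCache *c, uint64_t **table)
{
    assert(c->refs > 0);
    c->refs--;
    *table = NULL;
}

/* Write the table back to "disk" only if it was marked dirty. */
static void toy_table_flush(ToyTableCache *c)
{
    if (c->dirty) {
        memcpy(c->disk, c->table, sizeof(c->disk));
        c->dirty = false;
    }
}
```

In this model a modified entry reaches the "disk" array only in
toy_table_flush(), never in toy_table_put(), which is the single point where
ordering rules between caches could be enforced.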

Kevin
