From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1KyOQo-00031J-CP
	for qemu-devel@nongnu.org; Fri, 07 Nov 2008 05:19:10 -0500
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1KyOQk-00030V-Ih
	for qemu-devel@nongnu.org; Fri, 07 Nov 2008 05:19:08 -0500
Received: from [199.232.76.173] (port=47094 helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1KyOQj-00030L-6B
	for qemu-devel@nongnu.org; Fri, 07 Nov 2008 05:19:05 -0500
Received: from cantor2.suse.de ([195.135.220.15]:34961 helo=mx2.suse.de)
	by monty-python.gnu.org with esmtps
	(TLS-1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.60)
	(envelope-from <kwolf@suse.de>) id 1KyOQj-0004Yv-AL
	for qemu-devel@nongnu.org; Fri, 07 Nov 2008 05:19:05 -0500
Message-ID: <491416BF.8070007@suse.de>
Date: Fri, 07 Nov 2008 11:21:51 +0100
From: Kevin Wolf <kwolf@suse.de>
MIME-Version: 1.0
Subject: Re: [Qemu-devel] [PATCH 2/4] qcow2: Allow	update_cluster_refcount()
	to	update several clusters refcount.
References: <20081106165212.380421945@bull.net>	
	<1225990558.6576.12.camel@frecb07144> <4913336E.3060309@suse.de>
	<1226052194.4046.21.camel@frecb07144>
In-Reply-To: <1226052194.4046.21.camel@frecb07144>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Laurent Vivier <Laurent.Vivier@bull.net>
Cc: qemu-devel@nongnu.org

Laurent Vivier schrieb:
>>> +        block_index = cluster_index & (refcount_cache_size - 1);
>>> +        refcount = 0;
>>> +        while (nb_clusters &&
>>> +               block_index + nb_block_index < refcount_cache_size) {
>>> +
>>> +            refcount = be16_to_cpu(
>>> +                         s->refcount_block_cache[block_index + nb_block_index]);
>>> +            refcount += addend;
>>> +            if (refcount < 0 || refcount > 0xffff)
>>> +                return -EINVAL;
>> Here we possibly abort in the middle of the operation. If it fails
>> somewhere in the fifth refcount block, what happens with the four
>> already updated blocks?
> 
> Yes, you're right. I think we have at least to save refcount we have
> updated.
> 
>>> +            if (refcount == 0 &&
>>> +                cluster_index + nb_block_index < s->free_cluster_index) {
>>> +                s->free_cluster_index = cluster_index + nb_block_index;
>>> +            }
>>> +            s->refcount_block_cache[block_index + nb_block_index] =
>>> +                                                          cpu_to_be16(refcount);
>>> +            nb_block_index++;
>>> +            nb_clusters--;
>>> +        }
>>> +        if (bdrv_pwrite(s->hd,
>>> +                        refcount_block_offset + (block_index << REFCOUNT_SHIFT),
>>> +                        s->refcount_block_cache + block_index,
>>> +                        nb_block_index * sizeof(uint16_t)) !=
>>> +                        nb_block_index * sizeof(uint16_t))
>>>              return -EIO;
>> Same here.
> 
> I think we are not worst than the original behavior. I don't know how to
> manage this better.

Right, this hasn't been optimal before either. I noticed later that
update_refcount didn't even check the return value. So the difference is
that you abort while the old implementation tries the next block.

I'm unsure which behaviour is the better one. Actually, both don't feel
quite right. But being not worse in this aspect than the original still
makes this patch an improvement, so...

>>>      }
>>> -    /* we can update the count and save it */
>>> -    block_index = cluster_index &
>>> -        ((1 << (s->cluster_bits - REFCOUNT_SHIFT)) - 1);
>>> -    refcount = be16_to_cpu(s->refcount_block_cache[block_index]);
>>> -    refcount += addend;
>>> -    if (refcount < 0 || refcount > 0xffff)
>>> -        return -EINVAL;
>>> -    if (refcount == 0 && cluster_index < s->free_cluster_index) {
>>> -        s->free_cluster_index = cluster_index;
>>> -    }
>>> -    s->refcount_block_cache[block_index] = cpu_to_be16(refcount);
>>> -    if (bdrv_pwrite(s->hd,
>>> -                    refcount_block_offset + (block_index << REFCOUNT_SHIFT),
>>> -                    &s->refcount_block_cache[block_index], 2) != 2)
>>> -        return -EIO;
>>>      return refcount;
>>>  }
>>>  
>>> @@ -2437,7 +2469,7 @@ static void update_refcount(BlockDriverS
>>>                              int addend)
>>>  {
>>>      BDRVQcowState *s = bs->opaque;
>>> -    int64_t start, last, cluster_offset;
>>> +    int64_t start, last;
>>>  
>>>  #ifdef DEBUG_ALLOC2
>>>      printf("update_refcount: offset=%lld size=%lld addend=%d\n",
>>> @@ -2445,12 +2477,9 @@ static void update_refcount(BlockDriverS
>>>  #endif
>>>      if (length <= 0)
>>>          return;
>>> -    start = offset & ~(s->cluster_size - 1);
>>> -    last = (offset + length - 1) & ~(s->cluster_size - 1);
>>> -    for(cluster_offset = start; cluster_offset <= last;
>>> -        cluster_offset += s->cluster_size) {
>>> -        update_cluster_refcount(bs, cluster_offset >> s->cluster_bits, addend);
>>> -    }
>>> +    start = offset >> s->cluster_bits;
>>> +    last = (offset + length) >> s->cluster_bits;
>> Off by one for length % cluster_size == 0?
> 
> Explain, please (but notice the "<=" in the "for (...)").

I didn't even look at the old code yesterday, but the difference is that
you dropped the - 1 for last.

Let's do it by example: Assume cluster_size = 0x1000 for simplicity and
we get offset = 0x1000 and length = 0x2000 (i.e. the last affected byte
would be 0x2fff). The old code correctly produces start = 0x1000 and
last = 0x2000 whereas you get start = 1 and last = 3.

Kevin