From: Charlie Shepherd <charlie@ctshepherd.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: kwolf@redhat.com, stefanha@gmail.com, gabriel@kerneis.info,
qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 1/2] Make cow_co_is_allocated and cow_update_bitmap more efficient
Date: Tue, 20 Aug 2013 23:53:48 +0100 [thread overview]
Message-ID: <5213F37C.8090300@ctshepherd.com> (raw)
In-Reply-To: <5213D628.4030409@redhat.com>
On 20/08/2013 21:48, Paolo Bonzini wrote:
> Il 20/08/2013 20:34, Charlie Shepherd ha scritto:
>> /* Return true if first block has been changed (ie. current version is
>> @@ -146,40 +114,82 @@ static inline int is_bit_set(BlockDriverState *bs, int64_t bitnum)
>> static int coroutine_fn cow_co_is_allocated(BlockDriverState *bs,
>> int64_t sector_num, int nb_sectors, int *num_same)
>> {
>> - int changed;
>> + int ret, changed;
>> + uint64_t offset = sizeof(struct cow_header_v2) + sector_num / 8;
>> +
>> + int init_bits = (sector_num % 8) ? (8 - (sector_num % 8)) : 0;
>> + int remaining = sector_num - init_bits;
>> + int full_bytes = remaining / 8;
>> + int trail = remaining % 8;
>> +
>> + int len = !!init_bits + full_bytes + !!trail;
>> + uint8_t bitmap[len];
> This is a basically unbounded allocation on the stack. You should split
> this in smaller ranges using the "num_same" argument, which is what I
> did in my patch.
So if I understand your patch correctly, you read the next 512 bytes
(ie, one BDRV_SECTOR_SIZE) after offset into bitmap? Is this guaranteed
to be safe (like if the file isn't that long)? What if nb_sectors > 512 * 8?
I think it's best to use your version of cow_co_is_allocated(), but
those are the questions that come to mind when trying to convert the
stack allocation in cow_update_bitmap()
>> if (nb_sectors == 0) {
>> - *num_same = nb_sectors;
>> - return 0;
>> + *num_same = nb_sectors;
>> + return 0;
>> }
>>
>> - changed = is_bit_set(bs, sector_num);
>> - if (changed < 0) {
>> - return 0; /* XXX: how to return I/O errors? */
>> + ret = bdrv_pread(bs->file, offset, bitmap, len);
>> + if (ret < 0) {
>> + return ret;
>> }
>>
>> + changed = cow_test_bit(sector_num, bitmap);
>> for (*num_same = 1; *num_same < nb_sectors; (*num_same)++) {
>> - if (is_bit_set(bs, sector_num + *num_same) != changed)
>> - break;
>> + if (cow_test_bit(sector_num + *num_same, bitmap) != changed) {
>> + break;
>> + }
>> }
>>
>> return changed;
>> }
>>
>> +/* Set the bits from sector_num to sector_num + nb_sectors in the bitmap of
>> + * bs->file. */
>> static int cow_update_bitmap(BlockDriverState *bs, int64_t sector_num,
>> int nb_sectors)
>> {
>> - int error = 0;
>> - int i;
>> + int ret;
>> + uint64_t offset = sizeof(struct cow_header_v2) + sector_num / 8;
>>
>> - for (i = 0; i < nb_sectors; i++) {
>> - error = cow_set_bit(bs, sector_num + i);
>> - if (error) {
>> - break;
>> - }
>> + int init_bits = (sector_num % 8) ? (8 - (sector_num % 8)) : 0;
>> + int remaining = sector_num - init_bits;
>> + int full_bytes = remaining / 8;
>> + int trail = remaining % 8;
>> +
>> + int len = !!init_bits + full_bytes + !!trail;
>> + uint8_t buf[len];
> Here your patch has indeed an improvement over mine. However, this is
> another basically unbounded allocation on the stack. You should split
> bitmap updates in smaller parts (doing 512-byte aligned writes is fine,
> each covers 2MB in the file and writes this big are very rare!).
>
>> + ret = bdrv_pread(bs->file, offset, buf, len);
>> + if (ret < 0) {
>> + return ret;
>> + }
>> +
>> + /* Do sector_num -> nearest byte boundary */
>> + if (init_bits) {
>> + /* This sets the highest init_bits bits in the byte */
>> + uint8_t bits = ((1 << init_bits) - 1) << (8 - init_bits);
>> + buf[0] |= bits;
>> + }
>> +
>> + if (full_bytes) {
>> + memset(&buf[!!init_bits], ~0, full_bytes);
>> + }
>> +
>> + /* Set the trailing bits in the final byte */
>> + if (trail) {
>> + /* This sets the lowest trail bits in the byte */
>> + uint8_t bits = (1 << trail) - 1;
>> + buf[len - 1] |= bits;
>> + }
> ... and you should also check if there is a change in the bits, and skip
> the flush if there is no change. Flushing a multi-megabyte write is
> very expensive. It basically makes format=cow as slow as
> format=raw,cache=writethrough.
So if ORing the allocation makes no difference, don't flush?
Charlie
>> + ret = bdrv_pwrite(bs->file, offset, buf, len);
>> + if (ret < 0) {
>> + return ret;
>> }
>>
>> - return error;
>> + return 0;
>> }
>>
>> static int coroutine_fn cow_read(BlockDriverState *bs, int64_t sector_num,
>> @@ -237,6 +247,13 @@ static int cow_write(BlockDriverState *bs, int64_t sector_num,
>> return ret;
>> }
>>
>> + /* We need to flush the data before writing the metadata so that there is
>> + * no chance of metadata referring to data that doesn't exist. */
>> + ret = bdrv_flush(bs->file);
>> + if (ret < 0) {
>> + return ret;
>> + }
> See above about this flush.
>
> Paolo
>
>> return cow_update_bitmap(bs, sector_num, nb_sectors);
>> }
>>
>>
next prev parent reply other threads:[~2013-08-20 22:59 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-08-20 18:34 [Qemu-devel] [PATCH 1/2] Make cow_co_is_allocated and cow_update_bitmap more efficient Charlie Shepherd
2013-08-20 18:37 ` Charlie Shepherd
2013-08-20 20:48 ` Paolo Bonzini
2013-08-20 22:53 ` Charlie Shepherd [this message]
2013-08-21 8:14 ` Paolo Bonzini
2013-08-21 9:11 ` Charlie Shepherd
2013-08-21 9:19 ` Paolo Bonzini
2013-08-21 10:31 ` Charlie Shepherd
2013-08-21 10:38 ` Paolo Bonzini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5213F37C.8090300@ctshepherd.com \
--to=charlie@ctshepherd.com \
--cc=gabriel@kerneis.info \
--cc=kwolf@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.