From: Charlie Shepherd <charlie@ctshepherd.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: kwolf@redhat.com, stefanha@gmail.com, gabriel@kerneis.info,
qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 1/2] Make cow_co_is_allocated and cow_update_bitmap more efficient
Date: Tue, 20 Aug 2013 23:53:48 +0100 [thread overview]
Message-ID: <5213F37C.8090300@ctshepherd.com> (raw)
In-Reply-To: <5213D628.4030409@redhat.com>
On 20/08/2013 21:48, Paolo Bonzini wrote:
> Il 20/08/2013 20:34, Charlie Shepherd ha scritto:
>> /* Return true if first block has been changed (ie. current version is
>> @@ -146,40 +114,82 @@ static inline int is_bit_set(BlockDriverState *bs, int64_t bitnum)
>> static int coroutine_fn cow_co_is_allocated(BlockDriverState *bs,
>> int64_t sector_num, int nb_sectors, int *num_same)
>> {
>> - int changed;
>> + int ret, changed;
>> + uint64_t offset = sizeof(struct cow_header_v2) + sector_num / 8;
>> +
>> + int init_bits = (sector_num % 8) ? (8 - (sector_num % 8)) : 0;
>> + int remaining = sector_num - init_bits;
>> + int full_bytes = remaining / 8;
>> + int trail = remaining % 8;
>> +
>> + int len = !!init_bits + full_bytes + !!trail;
>> + uint8_t bitmap[len];
> This is a basically unbounded allocation on the stack. You should split
> this in smaller ranges using the "num_same" argument, which is what I
> did in my patch.
So if I understand your patch correctly, you read the next 512 bytes
(ie, one BDRV_SECTOR_SIZE) after offset into bitmap? Is this guaranteed
to be safe (like if the file isn't that long)? What if nb_sectors > 512 * 8?
I think it's best to use your version of cow_co_is_allocated(), but
those are the questions that come to mind when trying to convert the
stack allocation in cow_update_bitmap()
>> if (nb_sectors == 0) {
>> - *num_same = nb_sectors;
>> - return 0;
>> + *num_same = nb_sectors;
>> + return 0;
>> }
>>
>> - changed = is_bit_set(bs, sector_num);
>> - if (changed < 0) {
>> - return 0; /* XXX: how to return I/O errors? */
>> + ret = bdrv_pread(bs->file, offset, bitmap, len);
>> + if (ret < 0) {
>> + return ret;
>> }
>>
>> + changed = cow_test_bit(sector_num, bitmap);
>> for (*num_same = 1; *num_same < nb_sectors; (*num_same)++) {
>> - if (is_bit_set(bs, sector_num + *num_same) != changed)
>> - break;
>> + if (cow_test_bit(sector_num + *num_same, bitmap) != changed) {
>> + break;
>> + }
>> }
>>
>> return changed;
>> }
>>
>> +/* Set the bits from sector_num to sector_num + nb_sectors in the bitmap of
>> + * bs->file. */
>> static int cow_update_bitmap(BlockDriverState *bs, int64_t sector_num,
>> int nb_sectors)
>> {
>> - int error = 0;
>> - int i;
>> + int ret;
>> + uint64_t offset = sizeof(struct cow_header_v2) + sector_num / 8;
>>
>> - for (i = 0; i < nb_sectors; i++) {
>> - error = cow_set_bit(bs, sector_num + i);
>> - if (error) {
>> - break;
>> - }
>> + int init_bits = (sector_num % 8) ? (8 - (sector_num % 8)) : 0;
>> + int remaining = sector_num - init_bits;
>> + int full_bytes = remaining / 8;
>> + int trail = remaining % 8;
>> +
>> + int len = !!init_bits + full_bytes + !!trail;
>> + uint8_t buf[len];
> Here your patch has indeed an improvement over mine. However, this is
> another basically unbounded allocation on the stack. You should split
> bitmap updates in smaller parts (doing 512-byte aligned writes is fine,
> each covers 2MB in the file and writes this big are very rare!).
>
>> + ret = bdrv_pread(bs->file, offset, buf, len);
>> + if (ret < 0) {
>> + return ret;
>> + }
>> +
>> + /* Do sector_num -> nearest byte boundary */
>> + if (init_bits) {
>> + /* This sets the highest init_bits bits in the byte */
>> + uint8_t bits = ((1 << init_bits) - 1) << (8 - init_bits);
>> + buf[0] |= bits;
>> + }
>> +
>> + if (full_bytes) {
>> + memset(&buf[!!init_bits], ~0, full_bytes);
>> + }
>> +
>> + /* Set the trailing bits in the final byte */
>> + if (trail) {
>> + /* This sets the lowest trail bits in the byte */
>> + uint8_t bits = (1 << trail) - 1;
>> + buf[len - 1] |= bits;
>> + }
> ... and you should also check if there is a change in the bits, and skip
> the flush if there is no change. Flushing a multi-megabyte write is
> very expensive. It basically makes format=cow as slow as
> format=raw,cache=writethrough.
So if ORing the allocation makes no difference, don't flush?
Charlie
>> + ret = bdrv_pwrite(bs->file, offset, buf, len);
>> + if (ret < 0) {
>> + return ret;
>> }
>>
>> - return error;
>> + return 0;
>> }
>>
>> static int coroutine_fn cow_read(BlockDriverState *bs, int64_t sector_num,
>> @@ -237,6 +247,13 @@ static int cow_write(BlockDriverState *bs, int64_t sector_num,
>> return ret;
>> }
>>
>> + /* We need to flush the data before writing the metadata so that there is
>> + * no chance of metadata referring to data that doesn't exist. */
>> + ret = bdrv_flush(bs->file);
>> + if (ret < 0) {
>> + return ret;
>> + }
> See above about this flush.
>
> Paolo
>
>> return cow_update_bitmap(bs, sector_num, nb_sectors);
>> }
>>
>>
next prev parent reply other threads:[~2013-08-20 22:59 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-08-20 18:34 [Qemu-devel] [PATCH 1/2] Make cow_co_is_allocated and cow_update_bitmap more efficient Charlie Shepherd
2013-08-20 18:37 ` Charlie Shepherd
2013-08-20 20:48 ` Paolo Bonzini
2013-08-20 22:53 ` Charlie Shepherd [this message]
2013-08-21 8:14 ` Paolo Bonzini
2013-08-21 9:11 ` Charlie Shepherd
2013-08-21 9:19 ` Paolo Bonzini
2013-08-21 10:31 ` Charlie Shepherd
2013-08-21 10:38 ` Paolo Bonzini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5213F37C.8090300@ctshepherd.com \
--to=charlie@ctshepherd.com \
--cc=gabriel@kerneis.info \
--cc=kwolf@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).