qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: John Snow <jsnow@redhat.com>, Kevin Wolf <kwolf@redhat.com>
Cc: famz@redhat.com, qemu-devel@nongnu.org, mreitz@redhat.com,
	stefanha@redhat.com, den@openvz.org
Subject: Re: [Qemu-devel] [PATCH v7] spec: add qcow2 bitmaps extension specification
Date: Thu, 21 Jan 2016 11:22:49 +0300	[thread overview]
Message-ID: <56A09559.7040805@virtuozzo.com> (raw)
In-Reply-To: <569FFA90.9040401@redhat.com>

On 21.01.2016 00:22, John Snow wrote:
>
> On 01/20/2016 07:34 AM, Vladimir Sementsov-Ogievskiy wrote:
>> On 19.01.2016 20:48, Kevin Wolf wrote:
>>> Am 11.01.2016 um 14:05 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>> The new feature for qcow2: storing bitmaps.
>>>>
>>>> This patch adds new header extension to qcow2 - Bitmaps Extension. It
>>>> provides an ability to store virtual disk related bitmaps in a qcow2
>>>> image. For now there is only one type of such bitmaps: Dirty Tracking
>>>> Bitmap, which just tracks virtual disk changes from some moment.
>>>>
>>>> Note: Only bitmaps, relative to the virtual disk, stored in qcow2 file,
>>>> should be stored in this qcow2 file. The size of each bitmap
>>>> (considering its granularity) is equal to virtual disk size.
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>> ---
>>>>
>>>> v7:
>>>>
>>>> - Rewordings, grammar.
>>>>     Max, Eric, John, thank you very much.
>>>>
>>>> - add last paragraph: remaining bits in bitmap data clusters must be
>>>>     zero.
>>>>
>>>> - s/Bitmap Directory/bitmap directory/ and other names like this at
>>>>     the request of Max.
>>>>
>>>> v6:
>>>>
>>>> - reword bitmap_directory_size description
>>>> - bitmap type: make 0 reserved
>>>> - extra_data_size: resize to 4bytes
>>>>     Also, I've marked this field as "must be zero". We can always change
>>>>     it, if we decide allowing managing app to specify any extra data, by
>>>>     defining some magic value as a top of user extra data.. So, for now
>>>>     non zeor extra_data_size should be considered as an error.
>>>> - swap name and extra_data to give good alignment to extra_data.
>>>>
>>>>
>>>> v5:
>>>>
>>>> - 'Dirty bitmaps' renamed to 'Bitmaps', as we may have several types of
>>>>     bitmaps.
>>>> - rewordings
>>>> - move upper bounds to "Notes about Qemu limits"
>>>> - s/should/must somewhere. (but not everywhere)
>>>> - move name_size field closer to name itself in bitmap header
>>>> - add extra data area to bitmap header
>>>> - move bitmap data description to separate section
>>>>
>>>>    docs/specs/qcow2.txt | 172
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>>    1 file changed, 171 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
>>>> index 121dfc8..997239d 100644
>>>> --- a/docs/specs/qcow2.txt
>>>> +++ b/docs/specs/qcow2.txt
>>>> @@ -103,7 +103,18 @@ in the description of a field.
>>>>                        write to an image with unknown auto-clear
>>>> features if it
>>>>                        clears the respective bits from this field first.
>>>>    -                    Bits 0-63:  Reserved (set to 0)
>>>> +                    Bit 0:      Bitmaps extension bit
>>>> +                                This bit indicates consistency for
>>>> the bitmaps
>>>> +                                extension data.
>>>> +
>>>> +                                It is an error if this bit is set
>>>> without the
>>>> +                                bitmaps extension present.
>>>> +
>>>> +                                If the bitmaps extension is present
>>>> but this
>>>> +                                bit is unset, the bitmaps extension
>>>> data is
>>>> +                                inconsistent.
>>> It may as well be consistent, but we don't know.
>>>
>>> Perhaps something like "must be considered inconsistent" or "is
>>> potentially inconsistent".
>>>
>>>> +
>>>> +                    Bits 1-63:  Reserved (set to 0)
>>>>               96 -  99:  refcount_order
>>>>                        Describes the width of a reference count block
>>>> entry (width
>>>> @@ -123,6 +134,7 @@ be stored. Each extension has a structure like
>>>> the following:
>>>>                            0x00000000 - End of the header extension area
>>>>                            0xE2792ACA - Backing file format name
>>>>                            0x6803f857 - Feature name table
>>>> +                        0x23852875 - Bitmaps extension
>>>>                            other      - Unknown header extension, can
>>>> be safely
>>>>                                         ignored
>>>>    @@ -166,6 +178,34 @@ the header extension data. Each entry look
>>>> like this:
>>>>                        terminated if it has full length)
>>>>      +== Bitmaps extension ==
>>>> +
>>>> +The bitmaps extension is an optional header extension. It provides
>>>> the ability
>>>> +to store bitmaps related to a virtual disk. For now, there is only
>>>> one bitmap
>>>> +type: the dirty tracking bitmap, which tracks virtual disk changes
>>>> from some
>>>> +point in time.
>>> I have one major problem with this patch, and it starts here.
>>>
>>> The spec talks about dirty tracking bitmaps all the way, but it never
>>> really defines what a dirty tracking bitmap even contains. It has a few
>>> hints here and there, but they aren't consistent.
>>>
>>> Here's the first hint: They track "virtual disk changes", which implies
>>> they track guest clusters rather than host clusters.
>>>
>>>> +The data of the extension should be considered consistent only if the
>>>> +corresponding auto-clear feature bit is set, see autoclear_features
>>>> above.
>>>> +
>>>> +The fields of the bitmaps extension are:
>>>> +
>>>> +          0 -  3:  nb_bitmaps
>>>> +                   The number of bitmaps contained in the image.
>>>> Must be
>>>> +                   greater than or equal to 1.
>>>> +
>>>> +                   Note: Qemu currently only supports up to 65535
>>>> bitmaps per
>>>> +                   image.
>>>> +
>>>> +          4 -  7:  bitmap_directory_size
>>>> +                   Size of the bitmap directory in bytes. It is the
>>>> cumulative
>>>> +                   size of all (nb_bitmaps) bitmap headers.
>>>> +
>>>> +          8 - 15:  bitmap_directory_offset
>>>> +                   Offset into the image file at which the bitmap
>>>> directory
>>>> +                   starts. Must be aligned to a cluster boundary.
>>>> +
>>>> +
>>>>    == Host cluster management ==
>>>>      qcow2 manages the allocation of host clusters by maintaining a
>>>> reference count
>>>> @@ -360,3 +400,133 @@ Snapshot table entry:
>>>>              variable:   Padding to round up the snapshot table entry
>>>> size to the
>>>>                        next multiple of 8.
>>>> +
>>>> +
>>>> +== Bitmaps ==
>>>> +
>>>> +As mentioned above, the bitmaps extension provides the ability to
>>>> store bitmaps
>>>> +related a virtual disk. This section describes how these bitmaps are
>>>> stored.
>>> s/related/related to/
>>>
>>>> +Note: all bitmaps are related to the virtual disk stored in this image.
>>>> +
>>>> +=== Bitmap directory ===
>>>> +
>>>> +Each bitmap saved in the image is described in a bitmap directory
>>>> entry. The
>>>> +bitmap directory is a contiguous area in the image file, whose
>>>> starting offset
>>>> +and length are given by the header extension fields
>>>> bitmap_directory_offset and
>>>> +bitmap_directory_size. The entries of the bitmap directory have
>>>> variable
>>>> +length, depending on the length of the bitmap name and extra data.
>>>> These
>>>> +entries are also called bitmap headers.
>>>> +
>>>> +Structure of a bitmap directory entry:
>>>> +
>>>> +    Byte 0 -  7:    bitmap_table_offset
>>>> +                    Offset into the image file at which the bitmap
>>>> table
>>>> +                    (described below) for the bitmap starts. Must be
>>>> aligned to
>>>> +                    a cluster boundary.
>>>> +
>>>> +         8 - 11:    bitmap_table_size
>>>> +                    Number of entries in the bitmap table of the
>>>> bitmap.
>>>> +
>>>> +        12 - 15:    flags
>>>> +                    Bit
>>>> +                      0: in_use
>>>> +                         The bitmap was not saved correctly and may be
>>>> +                         inconsistent.
>>>> +
>>>> +                      1: auto
>>>> +                         The bitmap must reflect all changes of the
>>>> virtual
>>>> +                         disk by any application that would write to
>>>> this qcow2
>>>> +                         file (including writes, snapshot switching,
>>>> etc.). The
>>>> +                         type of this bitmap must be 'dirty tracking
>>>> bitmap'.
>>> This suggests that we can represent snapshot switching in a dirty
>>> tracking bitmap. Which is almost impossible if we track guest clusters,
>>> except if loading a snapshot should mean that all bits in the bitmap are
>>> set. However, that feels a bit useless and dirty tracking across
>>> snapshots doesn't seem to make a whole lot of sense anyway.
>>>
>>> Maybe what would make more sense is that a bitmap is tied to a specific
>>> snapshot or bitmaps are included in a snapshot, so that you can revert
>>> to the dirty status as it was when the snapshot was taken.
>>>
>>> Or, if you really meant, that the tracking words on a host cluster
>>> level, what clusters would be included in the bitmap? Would only the
>>> virtual disk be included? VM state? Any metadata?
>>>
>>> It seems we definitely need a new section on dirty tracking bitmaps that
>>> describes what the bitmap actually means and how it's supposed to work
>>> with snapshots. I guess we could also talk about how it works with other
>>> changes to the image like resizing.
>> Bitmaps track guest clusters.
>> Backup is not related to snapshots, so, I think set all bits in the
>> dirty bitmap on snapshot switch is ok. Of course we can't just ignore
>> snapshot switch, as it changes how user (and backup) sees the virtual
>> disk. Deleting the bitmap on snapshot switch is not good option too I
>> think.
>>
>> Bitmap size is defined to be equal to virtual disk size. So on resize,
>> the one who resize must resize the bitmap too, to maintain accordance of
>> the image file and the spec.
>>
>> I'll think about such section, ok.
>>
>>>> +                    Bits 2 - 31 are reserved and must be 0.
>>>> +
>>>> +             16:    type
>>>> +                    This field describes the sort of the bitmap.
>>>> +                    Values:
>>>> +                      1: Dirty tracking bitmap
>>>> +
>>>> +                    Values 0, 2 - 255 are reserved.
>>>> +
>>>> +             17:    granularity_bits
>>>> +                    Granularity bits. Valid values: 0 - 63.
>>>> +
>>>> +                    Note: Qemu currently doesn't support
>>>> granularity_bits
>>>> +                    greater than 31.
>>>> +
>>>> +                    Granularity is calculated as
>>>> +                        granularity = 1 << granularity_bits
>>>> +
>>>> +                    A bitmap's granularity is how many bytes of the
>>>> image
>>>> +                    accounts for one bit of the bitmap.
>>>> +
>>>> +        18 - 19:    name_size
>>>> +                    Size of the bitmap name. Must be non-zero.
>>>> +
>>>> +                    Note: Qemu currently doesn't support values
>>>> greater than
>>>> +                    1023.
>>>> +
>>>> +        20 - 23:    extra_data_size
>>>> +                    Size of type-specific extra data.
>>>> +
>>>> +                    For now, as no extra data is defined,
>>>> extra_data_size is
>>>> +                    reserved and must be zero.
>>>> +
>>>> +        variable:   Type-specific extra data for the bitmap.
>>> We talked about this in the other subthread.
>>>
>>>> +        variable:   The name of the bitmap (not null terminated).
>>>> Must be
>>>> +                    unique among all bitmap names within the bitmaps
>>>> extension.
>>>> +
>>>> +        variable:   Padding to round up the bitmap directory entry
>>>> size to the
>>>> +                    next multiple of 8.
>>>> +
>>>> +=== Bitmap table ===
>>>> +
>>>> +Bitmaps are stored using a one-level structure (as opposed to two-level
>>>> +structure like for refcounts and guest clusters mapping) for the
>>>> mapping of
>>>> +bitmap data to host clusters. This structure is called the bitmap
>>>> table.
>>>> +
>>>> +Each bitmap table has a variable size (stored in the bitmap
>>>> directory Entry)
>>>> +and may use multiple clusters, however, it must be contiguous in the
>>>> image
>>>> +file.
>>>> +
>>>> +Structure of a bitmap table entry:
>>>> +
>>>> +    Bit       0:    Reserved and must be zero if bits 9 - 55 are
>>>> non-zero.
>>>> +                    If bits 9 - 55 are zero:
>>>> +                      0: Cluster should be read as all zeros.
>>>> +                      1: Cluster should be read as all ones.
>>>> +
>>>> +         1 -  8:    Reserved and must be zero.
>>>> +
>>>> +         9 - 55:    Bits 9 - 55 of the host cluster offset. Must be
>>>> aligned to
>>>> +                    a cluster boundary. If the offset is 0, the
>>>> cluster is
>>>> +                    unallocated; in that case, bit 0 determines how
>>>> this
>>>> +                    cluster should be treated when read from.
>>>> +
>>>> +        56 - 63:    Reserved and must be zero.
>>>> +
>>>> +=== Bitmap data ===
>>>> +
>>>> +As noted above, bitmap data is stored in separate clusters,
>>>> described by the
>>>> +bitmap table. Given an offset (in bytes) into the bitmap data, the
>>>> offset into
>>>> +the image file can be obtained as follows:
>>>> +
>>>> +    image_offset =
>>>> +        bitmap_table[bitmap_data_offset / cluster_size] +
>>>> +            (bitmap_data_offset % cluster_size)
>>>> +
>>>> +This offset is not defined if bits 9 - 55 of bitmap table entry are
>>>> zero (see
>>>> +above).
>>>> +
>>>> +Given an offset byte_nr into the virtual disk and the bitmap's
>>>> granularity, the
>>>> +bit offset into the bitmap can be calculated like this:
>>>> +
>>>> +    bit_offset =
>>>> +        image_offset(byte_nr / granularity / 8) * 8 +
>>>> +            (byte_nr / granularity) % 8
>>>> +
>>>> +If the size of the bitmap data is not a multiply of cluster size
>>>> then the last
>>>> +cluster of the bitmap data contains some unused tail bits. These
>>>> bits must be
>>>> +zero.
>>> In which order are the bits stored in the bitmap?
>> What do you mean?
> He means BE or LE. You can intuit it from the formula, but it's not
> explicitly stated, I think

byte ordering? But it make sense when we deal with integers, which 
occupies more then 1 byte. Here is plain byte sequence. Bitmap. Like a 
string.. What should I write here additionally?

>>> Kevin
>>


-- 
Best regards,
Vladimir
* now, @virtuozzo.com instead of @parallels.com. Sorry for this inconvenience.

  reply	other threads:[~2016-01-21  8:23 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-11 13:05 [Qemu-devel] [PATCH v7] spec: add qcow2 bitmaps extension specification Vladimir Sementsov-Ogievskiy
2016-01-12  0:30 ` John Snow
2016-01-14 11:35   ` Denis V. Lunev
2016-01-14 16:42     ` John Snow
2016-01-14 22:08 ` Eric Blake
2016-01-14 23:26   ` John Snow
2016-01-16 14:06     ` Vladimir Sementsov-Ogievskiy
2016-01-18 16:54       ` John Snow
2016-01-18 21:16         ` Eric Blake
2016-01-19  8:57           ` Vladimir Sementsov-Ogievskiy
2016-01-19 17:29             ` Kevin Wolf
2016-01-25 10:15               ` Vladimir Sementsov-Ogievskiy
2016-01-25 11:09                 ` Kevin Wolf
2016-01-25 12:27                   ` Vladimir Sementsov-Ogievskiy
2016-01-19 17:27           ` Kevin Wolf
2016-01-25 10:22             ` Vladimir Sementsov-Ogievskiy
2016-01-19 17:48 ` Kevin Wolf
2016-01-20 12:34   ` Vladimir Sementsov-Ogievskiy
2016-01-20 21:22     ` John Snow
2016-01-21  8:22       ` Vladimir Sementsov-Ogievskiy [this message]
2016-01-21  9:53         ` Kevin Wolf
2016-01-21 10:44           ` Vladimir Sementsov-Ogievskiy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56A09559.7040805@virtuozzo.com \
    --to=vsementsov@virtuozzo.com \
    --cc=den@openvz.org \
    --cc=famz@redhat.com \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).