From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:44311) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a8ZMu-0008E7-PF for qemu-devel@nongnu.org; Mon, 14 Dec 2015 15:05:14 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a8ZMq-0001UO-KT for qemu-devel@nongnu.org; Mon, 14 Dec 2015 15:05:12 -0500 Received: from mx1.redhat.com ([209.132.183.28]:56818) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a8ZMq-0001Ty-BH for qemu-devel@nongnu.org; Mon, 14 Dec 2015 15:05:08 -0500 References: <1450115035-15473-1-git-send-email-vsementsov@virtuozzo.com> From: Max Reitz Message-ID: <566F20EE.9070409@redhat.com> Date: Mon, 14 Dec 2015 21:05:02 +0100 MIME-Version: 1.0 In-Reply-To: <1450115035-15473-1-git-send-email-vsementsov@virtuozzo.com> Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="bcxNVUMjDfbCW5em8rLIwNA3iQOFsCM4M" Subject: Re: [Qemu-devel] [PATCH v4 RFC] spec: add qcow2-dirty-bitmaps specification List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Vladimir Sementsov-Ogievskiy , qemu-devel@nongnu.org Cc: kwolf@redhat.com, famz@redhat.com, stefanha@redhat.com, den@openvz.org, jsnow@redhat.com This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --bcxNVUMjDfbCW5em8rLIwNA3iQOFsCM4M Content-Type: text/plain; charset=iso-8859-15 Content-Transfer-Encoding: quoted-printable On 14.12.2015 18:43, Vladimir Sementsov-Ogievskiy wrote: > The new feature for qcow2: storing dirty bitmaps. >=20 > Only dirty bitmaps relative to this qcow2 image should be stored in it.= >=20 > Strings started from +# are RFC-strings, not to be commited of course. >=20 >=20 > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- >=20 > docs/specs/qcow2.txt | 151 +++++++++++++++++++++++++++++++++++++++++++= +++++++- > 1 file changed, 150 insertions(+), 1 deletion(-) Overall: Looks better to me. Good enough for me to ACK it, but I still have some issues with it. Let's evaluate the main point of critique I had: I really want this not to be qemu-specific but potentially useful to all programs. Pretty good: You do implicitly describe what a (dirty) bitmap looks like by describing how to obtain the bit offset of a certain byte guest offset. So it's not an opaque binary data dump anymore. (Why only "pretty good"? I find the description to be a bit too "implicit", I think a separate section describing the bitmap structure would be better.) Good: The bitmap actually describes the qcow2 file. Not so good: While now any program knows how to read the bitmap and that it does refer to this qcow2 file, it's interpretation is not so easy still. Generally, a dirty bitmap has some reference point, that is the state of the disk when the bitmap was cleared or created. For instance, for incremental backups, whenever you create a backup based on a dirty bitmap, the dirty bitmap is cleared and the backup target is then said reference point. I think it would be nice to put that reference point (i.e. the name of an image file that contains the clean image) into the dirty bitmap header, if possible. (Note: I won't comment on orthography, because I feel like that is something a native speaker should do. O:-)) > diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt > index 121dfc8..3c89580 100644 > --- a/docs/specs/qcow2.txt > +++ b/docs/specs/qcow2.txt > @@ -103,7 +103,17 @@ in the description of a field. > write to an image with unknown auto-clear features= if it > clears the respective bits from this field first. > =20 > - Bits 0-63: Reserved (set to 0) > + Bit 0: Dirty bitmaps bit. > + This bit is responsible for Dirty bitm= aps > + extension consistency. > + If it is set, but there is no Dirty bi= tmaps > + extensions, this should be considered = as an > + error. > + If it is not set, but there is a Dirty= bitmaps > + extension, its data should be consider= ed as > + inconsistent. > + > + Bits 1-63: Reserved (set to 0) > =20 > 96 - 99: refcount_order > Describes the width of a reference count block ent= ry (width > @@ -123,6 +133,7 @@ be stored. Each extension has a structure like the = following: > 0x00000000 - End of the header extension area > 0xE2792ACA - Backing file format name > 0x6803f857 - Feature name table > + 0x23852875 - Dirty bitmaps > other - Unknown header extension, can be = safely > ignored > =20 > @@ -166,6 +177,31 @@ the header extension data. Each entry look like th= is: > terminated if it has full length) > =20 > =20 > +=3D=3D Dirty bitmaps =3D=3D > + > +Dirty bitmaps is an optional header extension. It provides an ability = to store > +dirty bitmaps in a qcow2 image. The data of this extension should be c= onsidered > +as consistent only if corresponding auto-clear feature bit is set (see= > +autoclear_features above). > +The fields of Dirty bitmaps extension are: > + > + 0 - 3: nb_dirty_bitmaps > + The number of dirty bitmaps contained in the image.= Valid > + values: 1 - 65535. Again, I don't see a reason for why we should impose a strict upper limit here. I'd prefer "Note that qemu currently only supports up to 65535 dirty bitmaps per image." > +# Let's be strict, the feature should be deleted with deleting last bi= tmap. > + > + 4 - 7: dirty_bitmap_directory_size > + Size of the Dirty Bitmap Directory in bytes. It sho= uld be > + equal to sum of sizes of all (nb_dirty_bitmaps) dir= ty bitmap > + headers. No, it "should" not be equal, it *must* be equal. But I think you can just omit that last sentence, that would be just as fine. > +# This field is necessary to effectively read Dirty Bitmap Directory, = because > +# it's entries (which are dirty bitmap headers) may have different len= gths. > + > + 8 - 15: dirty_bitmap_directory_offset > + Offset into the image file at which the Dirty Bitma= p > + Directory starts. Must be aligned to a cluster boun= dary. > + > + > =3D=3D Host cluster management =3D=3D > =20 > qcow2 manages the allocation of host clusters by maintaining a referen= ce count > @@ -360,3 +396,116 @@ Snapshot table entry: > =20 > variable: Padding to round up the snapshot table entry size = to the > next multiple of 8. > + > + > +=3D=3D Dirty bitmaps =3D=3D > + > +The feature supports storing dirty bitmaps in a qcow2 image. All dirty= bitmaps > +are relating to the virtual disk, stored in this image. > + > +=3D=3D=3D Dirty Bitmap Directory =3D=3D=3D > + > +Each dirty bitmap saved in the image is described in a Dirty Bitmap Di= rectory > +entry. Dirty Bitmap Directory is a contiguous area in the image file, = whose > +starting offset and length are given by the header extension fields > +dirty_bitmap_directory_offset and dirty_bitmap_directory_size. The ent= ries of > +the bitmap directory have variable length, depending on the length of = the > +bitmap name. These entries are also called dirty bitmap headers. > + > +Dirty Bitmap Directory Entry: > + > + Byte 0 - 7: dirty_bitmap_table_offset > + Offset into the image file at which the Dirty Bitm= ap Table > + (described below) for the bitmap starts. Must be a= ligned to > + a cluster boundary. > + > + 8 - 11: dirty_bitmap_table_size > + Number of entries in the Dirty Bitmap Table of the= bitmap. > + > + 12 - 15: flags > + Bit > + 0: in_use > + The bitmap was not saved correctly and may be= > + inconsistent. > + > + 1: auto > + The bitmap should be autoloaded as block dirt= y bitmap > + and tracking should be started. Type of the b= itmap > + should be 'Dirty Tracking Bitmap'. I find the wording a bit too qemu-specific. How about this: This bitmap is the default dirty bitmap for the virtual disk represented by this qcow2 image. It should track all write accesses immediately after the image has been opened. And I find the "should" in "Type of the bitmap should be..." a bit too weak. Maybe "This flag is valid only for Dirty Tracking Bitmaps" is bette= r. > + > + Bits 2 - 31 are reserved and must be 0. > + > + 16 - 17: name_size > + Size of the bitmap name. Valid values: 1 - 1023. > + > + 18: type > + This field describes the sort of the bitmap. > + Values: > + 0: Dirty Tracking Bitmap If we allow different kinds of bitmaps, it should not be called "dirty bitmap" everywhere anymore. > + > + Values 1 - 255 are reserved. > +# This is mostly for error checking and information in qemu-img info o= utput. > +# The other types may be, for example, "Backup Bitmap" - to make it po= ssible > +# stop backup job on vm stop and resume it later. The another one is "= Sector > +# Alloction Bitmap" (Fam, John, please comment). I'm waiting for their comments because that sounds like "refcount table with refcount_bits=3D1" to me. :-) > + 19: granularity_bits > + Granularity bits. Valid values are: 0 - 31. > +# Now, qemu allows creating bitmaps with granularity as a 32bit value.= And > +# there are no reasons of increasing it. Good (implicit) question. I can't imagine any reason for wanting to have a coarser granularity than 2 GB, but I do think there may be a need in the future for some people. Once again, I think we should discriminate between what is generally a useful limitation and what is simply due to qemu not supporting anything else right now. Thus, I think it would be better to increase the range to 0 - 63 and make a note that qemu only supports values up to 31 right now. > + > + Granularity is calculated as > + granularity =3D 1 << granularity_bits > + > + Granularity of the bitmap is how many bytes of the= image > + accounts for one bit of the bitmap. > +# To be closer to qcow2 and its reality, I've decided to use byte-gran= ularity > +# here, not sector-granularity. I like that. But do note that qcow2 does align everything at least to 512 bytes, so having used sector granularity wouldn't have been too bad. > + > + variable: The name of the bitmap (not null terminated). Shou= ld be > + unique among all dirty bitmap names within the Dir= ty > + bitmaps extension. > + > + variable: Padding to round up the Dirty Bitmap Directory Ent= ry size > + to the next multiple of 8. What I'd like here is variable additional information based on the bitmap type. For some types, this may be absolutely necessary; for dirty tracking bitmaps it depends on what we do about the reference point thing= =2E The reference point thing is the following: As mentioned at the top, I'd like there to be some kind of description of what the clean state was. As far as I know, this is generally a backup in the form of a file. In that case, we could put that filename here. I don't think not having a reference point description is a serious show stopper. qemu itself does rely on the management layer to know which bitmap to use when. But I think it would be pretty nice to have it here. > + > +=3D=3D=3D Dirty Bitmap Table =3D=3D=3D > + > +Dirty bitmaps are stored using a one-level (not two-level like refcoun= ts and > +guest clusters mapping) structure for the mapping of bitmaps to host c= lusters. > +It is called Dirty Bitmap Table. > + > +Each Dirty Bitmap Table has a variable size (stored in the Dirty Bitma= p > +Directory Entry) and may use multiple clusters, however it must be con= tiguous > +in the image file. > + > +Given an offset (in bytes) into the bitmap, the offset into the image = file can > +be obtained as follows: > + > + byte_offset =3D > + dirty_bitmap_table[offset / cluster_size] + (offset % cluster_= size) > + > +Taking into account the granularity of the bitmap, an offset in bits i= nto the > +image file, corresponding to byte number byte_nr of the image can be c= alculated > +like this: > + > + bit_offset =3D > + byte_offset(byte_nr / granularity / 8) * 8 + (byte_nr / granul= arity) % 8 > + > +Note: the last formula is only for understanding the things, it is unl= ikely for > +it to be useful in a program. I think this note is superfluous. All the pseudo-code in this file is only that, pseudo-code. ;-) Apart from that, I think this last formula should be in its own section ("Dirty Bitmaps" or simply "Bitmaps") which describes the structure of a bitmap. Putting this term there should basically suffice. I was about to say I'd like it to define the bit order, too (i.e. "bit offset 0 is the LSb"), but, well, it just uses the bit order used everywhere in qcow2. > + > +Dirty Bitmap Table entry: > + > + Bit 0: Reserved and should be zero if bits 9 - 55 are non= -zero. > + If bits 9 - 55 are zero: > + 0: Cluster should be read as all zeros. > + 1: Cluster should be read as all ones. > + > + 1 - 8: Reserved and must be zero. > + > + 9 - 55: Bits 9 - 55 of host cluster offset. Must be aligne= d to a > + cluster boundary. If the offset is 0, the cluster = is > + unallocated, see bit 0 description. > + > + 56 - 63: Reserved, must be 0. >=20 Looks good. Max --bcxNVUMjDfbCW5em8rLIwNA3iQOFsCM4M Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAEBCAAGBQJWbyDuAAoJEDuxQgLoOKytCZ8H/iKDs71YKWmAXhmZmJlk8A4W mTXC5lpM4VZzQg64J0TfmIMr3v5NKCoa99x9+z1YKJv6YILF9MY+mAmw4A/WMls+ i7Gg+jnJzDqbi55pyM8P5tcSnUR2IMkof0IPgixL8aZybbtnPJaovnIWi88mdYJ6 7cd5cex2ByM+tpYHqvoo7wCbcoH9rBMvwMR0c92DARsKiZdQNdvFvv8lPGukJPJw Xmu4rW+5+WoRI0A6jSk5Qw6Lnd5InoHuIY2QVR+5HzFJPNiIwzhALtcWIBSYaa84 T0Hx/MDql/0S2cd1qIElOsibu5noPyCduRD5dOh5Vr635HWmD27fXFZ11WwPyHw= =xL+z -----END PGP SIGNATURE----- --bcxNVUMjDfbCW5em8rLIwNA3iQOFsCM4M--