From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33151) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XtDtD-0002Q1-1e for qemu-devel@nongnu.org; Tue, 25 Nov 2014 06:02:41 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1XtDt5-00005Y-5K for qemu-devel@nongnu.org; Tue, 25 Nov 2014 06:02:34 -0500 Received: from mx1.redhat.com ([209.132.183.28]:42427) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XtDt4-00005P-Tm for qemu-devel@nongnu.org; Tue, 25 Nov 2014 06:02:27 -0500 Message-ID: <547461BB.2000509@redhat.com> Date: Tue, 25 Nov 2014 12:02:19 +0100 From: Max Reitz MIME-Version: 1.0 References: <1416844620-17717-1-git-send-email-mreitz@redhat.com> <1416844620-17717-2-git-send-email-mreitz@redhat.com> In-Reply-To: <1416844620-17717-2-git-send-email-mreitz@redhat.com> Content-Type: text/plain; charset=iso-8859-15; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH v2 01/12] qcow2: Add new overlap check functions List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: qemu-devel@nongnu.org Cc: Kevin Wolf , Peter Lieven , Stefan Hajnoczi On 2014-11-24 at 16:56, Max Reitz wrote: > The existing qcow2 metadata overlap detection function used existing > structures to determine the location of the image metadata, from plain > fields such as l1_table_offset and l1_size in the BDRVQcowState, over > image structures in memory such as the L1 table for the L2 tables' > positions, or it even read the required data directly from disk for > every requested check, such as the snapshot L1 tables for the inactive > L2 tables' positions. > > These new functions instead keep a dedicated structure for keeping track > of the metadata positions in memory. It consists of two parts: First, > there is one structure which is basically a list of all metadata > structures. Each entry has a bitmask of types (because some metadata > structures may actually overlap, such as active and inactive L2 tables), > a number of clusters occupied and the offset from the previous entry in > clusters. This structure requires relatively few memory, but checking a > certain point may take relatively long. Each entry is called a > "fragment". > > Therefore, there is another representation which is a bitmap, or rather > a bytemap, of metadata types. The previously described list is split > into multiple windows with each describing a constant number of clusters > (WINDOW_SIZE). If the list is to be queried or changed, the respective > window is selected in constant time and the bitmap is generated from the > fragments belonging to the window. This bitmap can then be queried in > constant time and easily be changed. > > Because the bitmap representation requires more memory, it is only used > as a cache. Whenever a window is removed from the cache, the fragment > list will be rebuilt from the bitmap if the latter has been modified. > Therefore, the fragment list is only used as the background > representation to save memory, whereas the bitmap is used whenever > possible. > > Regarding the size of the fragment list in memory: As one refcount block > can handle cluster_size / 2 entries and one L2 table can handle > cluster_size / 8 entries, for a qcow2 image with the standard cluster > size of 64 kB, there is a ratio of data to metadata of about 1/6000 > (1/32768 refblocks and 1/8192 L2 tables) if one ignores the fact that > not every cluster requires an L2 table entry. The refcount table and the > L1 table is generally negligible. At the worst, each metadata cluster > requires its own entry in the fragment list; each entry takes up four > bytes, therefore, at the worst, the fragment list should take up (for an > image with 64 kB clusters) (4 B) / (64 kB * 6000) of the image size, > which is about 1.e-8 (i.e., 11 kB for a 1 TB image, or 11 MB for a 1 PB > image). > > Signed-off-by: Max Reitz > --- > block/Makefile.objs | 3 +- > block/qcow2-overlap.c | 404 ++++++++++++++++++++++++++++++++++++++++++++++++++ > block/qcow2.h | 13 ++ > 3 files changed, 419 insertions(+), 1 deletion(-) > create mode 100644 block/qcow2-overlap.c > > diff --git a/block/qcow2-overlap.c b/block/qcow2-overlap.c > new file mode 100644 > index 0000000..9f3f2aa > --- /dev/null > +++ b/block/qcow2-overlap.c [snip] > + > +/** > + * Destroys the cached window bitmap. If it has been modified, the fragment list > + * will be rebuilt accordingly. > + */ > +static void destroy_window_bitmap(Qcow2MetadataList *mdl, > + Qcow2MetadataWindow *window) > +{ > + if (!window->bitmap) { > + return; > + } > + > + if (window->bitmap_modified) { > + int bitmap_i, fragment_i = 0; > + QCow2MetadataOverlap current_types = 0; > + int current_nb_clusters = 0; > + > + /* Rebuild the fragment list; the case bitmap_i == WINDOW_SIZE is for > + * entering the last fragment at the bitmap end */ > + > + for (bitmap_i = 0; bitmap_i <= WINDOW_SIZE; bitmap_i++) { > + /* Qcow2MetadataFragment::nb_clusters is a uint8_t, so > + * current_nb_clusters may not exceed 255 */ > + if (bitmap_i < WINDOW_SIZE && > + current_types == window->bitmap[bitmap_i] && > + current_nb_clusters < 255) > + { > + current_nb_clusters++; > + } else { > + if (current_types && current_nb_clusters) { > + if (fragment_i >= window->fragments_array_size) { > + window->fragments_array_size = > + 3 * window->fragments_array_size / 2 + 1; > + > + /* new_nb_fragments should be small enough, and there is > + * nothing we can do on failure anyway, so do not use > + * g_try_renew() here */ > + window->fragments = > + g_renew(Qcow2MetadataFragment, window->fragments, > + window->fragments_array_size); > + } > + > + window->fragments[fragment_i++] = (Qcow2MetadataFragment){ > + .types = current_types, > + .nb_clusters = current_nb_clusters, > + .relative_start = bitmap_i - current_nb_clusters, > + }; > + } > + > + current_nb_clusters = 0; Because I don't want to write a new version every time I find a single bug myself, I'll just reply here, so you know which things I know about. This should be 1, and not 0. At this point, we have found the first cluster of this fragment, which is why this should be 1. Max > + if (bitmap_i < WINDOW_SIZE) { > + current_types = window->bitmap[bitmap_i]; > + } > + } > + } > + > + window->nb_fragments = fragment_i; > + } > + > + g_free(window->bitmap); > + window->bitmap = NULL; > +}