From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 30 May 2008 02:03:40 -0700
From: "Marc Bevand"
Subject: [Qemu-devel] [PATCH] New qemu-img convert -B option to preserve the COW aspect of images and/or re-base them
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
To: qemu-devel@nongnu.org

If a disk image hd_a is a copy-on-write image based on the backing file
hd_base, it is currently impossible to use qemu-img to convert hd_a to hd_b
(possibly using another disk image format) while keeping hd_b a copy-on-write
image of hd_base. qemu-img also provides no way for an end user to re-base
an image, for example to change hd_a's backing file name from hd_base to
hd_base2 if it had to change for some reason.

This patch solves both problems by adding a new qemu-img convert -B option.
This is a generic feature that should work with ANY disk image format
supporting backing files.

Examples:

  $ qemu-img info hd_a
  image: hd_a
  file format: qcow
  virtual size: 6.0G (6442450944 bytes)
  disk size: 28K
  cluster_size: 512
  backing file: hd_base (actual path: hd_base)

Converting hd_a (qcow) to hd_b (qcow2) while preserving the copy-on-write
aspect of the image:

  $ qemu-img convert hd_a -O qcow2 -B hd_base hd_b
  $ qemu-img info hd_b
  image: hd_b
  file format: qcow2
  virtual size: 6.0G (6442450944 bytes)
  disk size: 36K
  cluster_size: 4096
  backing file: hd_base (actual path: hd_base)

Renaming the backing file without losing hd_a:

  $ ln hd_base hd_base2
  $ qemu-img convert hd_a -O qcow -B hd_base2 hd_a2
  $ mv hd_a2 hd_a
  $ rm hd_base
  $ qemu-img info hd_a
  image: hd_a
  file format: qcow
  virtual size: 6.0G (6442450944 bytes)
  disk size: 28K
  cluster_size: 512
  backing file: hd_base2 (actual path: hd_base2)

Patch made against SVN's rev 4622.

Signed-off-by: Marc Bevand gmail.com>

Index: qemu-img.texi
===================================================================
--- qemu-img.texi	(revision 4622)
+++ qemu-img.texi	(working copy)
@@ -10,7 +10,7 @@
 @table @option
 @item create [-e] [-6] [-b @var{base_image}] [-f @var{fmt}] @var{filename} [@var{size}]
 @item commit [-f @var{fmt}] @var{filename}
-@item convert [-c] [-e] [-6] [-f @var{fmt}] @var{filename} [-O @var{output_fmt}] @var{output_filename}
+@item convert [-c] [-e] [-6] [-f @var{fmt}] [-O @var{output_fmt}] [-B @var{output_base_image}] @var{filename} [@var{filename2} [...]] @var{output_filename}
 @item info [-f @var{fmt}] @var{filename}
 @end table
 
@@ -21,7 +21,11 @@
 @item base_image
 is the read-only disk image which is used as base for a copy on
 write image; the copy on write image only stores the modified data
-
+@item output_base_image
+forces the output image to be created as a copy on write
+image of the specified base image; @code{output_base_image} should have the same
+content as the input's base image, however the path, image format, etc may
+differ
 @item fmt
 is the disk image format. It is guessed automatically in most cases. The
 following formats are supported:
Index: block.c
===================================================================
--- block.c	(revision 4622)
+++ block.c	(working copy)
@@ -884,6 +884,32 @@
         bdrv_flush(bs->backing_hd);
 }
 
+/*
+ * Returns true iff the specified sector is present in the disk image. Drivers
+ * not implementing the functionality are assumed to not support backing files,
+ * hence all their sectors are reported as allocated.
+ *
+ * 'pnum' is set to the number of sectors (including and immediately following
+ * the specified sector) that are known to be in the same
+ * allocated/unallocated state.
+ *
+ * 'nb_sectors' is the max value 'pnum' should be set to.
+ */
+int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
+        int *pnum)
+{
+    if (!bs->drv->bdrv_is_allocated) {
+        if (sector_num >= bs->total_sectors) {
+            *pnum = 0;
+            return 0;
+        }
+        int64_t n = bs->total_sectors - sector_num;
+        *pnum = (n < nb_sectors) ? (n) : (nb_sectors);
+        return 1;
+    }
+    return bs->drv->bdrv_is_allocated(bs, sector_num, nb_sectors, pnum);
+}
+
 #ifndef QEMU_IMG
 void bdrv_info(void)
 {
Index: block.h
===================================================================
--- block.h	(revision 4622)
+++ block.h	(working copy)
@@ -99,6 +99,8 @@
 
 /* Ensure contents are flushed to disk. */
 void bdrv_flush(BlockDriverState *bs);
+int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
+        int *pnum);
 
 #define BDRV_TYPE_HD 0
 #define BDRV_TYPE_CDROM 1
Index: qemu-img.c
===================================================================
--- qemu-img.c	(revision 4622)
+++ qemu-img.c	(working copy)
@@ -55,13 +55,17 @@
            "Command syntax:\n"
            " create [-e] [-6] [-b base_image] [-f fmt] filename [size]\n"
            " commit [-f fmt] filename\n"
-           " convert [-c] [-e] [-6] [-f fmt] [-O output_fmt] filename [filename2 [...]] output_filename\n"
+           " convert [-c] [-e] [-6] [-f fmt] [-O output_fmt] [-B output_base_image] filename [filename2 [...]] output_filename\n"
            " info [-f fmt] filename\n"
            "\n"
            "Command parameters:\n"
            " 'filename' is a disk image filename\n"
            " 'base_image' is the read-only disk image which is used as base for a copy on\n"
            " write image; the copy on write image only stores the modified data\n"
+           " 'output_base_image' forces the output image to be created as a copy on write\n"
+           " image of the specified base image; 'output_base_image' should have the same\n"
+           " content as the input's base image, however the path, image format, etc may\n"
+           " differ\n"
            " 'fmt' is the disk image format. It is guessed automatically in most cases\n"
            " 'size' is the disk image size in kilobytes. Optional suffixes 'M' (megabyte)\n"
            " and 'G' (gigabyte) are supported\n"
@@ -350,6 +354,13 @@
     return 0;
 }
 
+/*
+ * Returns true iff the first sector pointed to by 'buf' contains at least
+ * a non-NUL byte.
+ *
+ * 'pnum' is set to the number of sectors (including and immediately following
+ * the first one) that are known to be in the same allocated/unallocated state.
+ */
 static int is_allocated_sectors(const uint8_t *buf, int n, int *pnum)
 {
     int v, i;
@@ -373,7 +384,7 @@
 static int img_convert(int argc, char **argv)
 {
     int c, ret, n, n1, bs_n, bs_i, flags, cluster_size, cluster_sectors;
-    const char *fmt, *out_fmt, *out_filename;
+    const char *fmt, *out_fmt, *out_baseimg, *out_filename;
     BlockDriver *drv;
     BlockDriverState **bs, *out_bs;
     int64_t total_sectors, nb_sectors, sector_num, bs_offset;
@@ -384,9 +395,10 @@
 
     fmt = NULL;
     out_fmt = "raw";
+    out_baseimg = NULL;
     flags = 0;
     for(;;) {
-        c = getopt(argc, argv, "f:O:hce6");
+        c = getopt(argc, argv, "f:O:B:hce6");
         if (c == -1)
             break;
         switch(c) {
@@ -399,6 +411,9 @@
         case 'O':
             out_fmt = optarg;
             break;
+        case 'B':
+            out_baseimg = optarg;
+            break;
         case 'c':
             flags |= BLOCK_FLAG_COMPRESS;
             break;
@@ -415,6 +430,9 @@
     if (bs_n < 1) help();
 
     out_filename = argv[argc - 1];
+
+    if (bs_n > 1 && out_baseimg)
+        error("-B makes no sense when concatenating multiple input images");
 
     bs = calloc(bs_n, sizeof(BlockDriverState *));
     if (!bs)
@@ -441,7 +459,7 @@
     if (flags & BLOCK_FLAG_ENCRYPT && flags & BLOCK_FLAG_COMPRESS)
         error("Compression and encryption not supported at the same time");
 
-    ret = bdrv_create(drv, out_filename, total_sectors, NULL, flags);
+    ret = bdrv_create(drv, out_filename, total_sectors, out_baseimg, flags);
     if (ret < 0) {
         if (ret == -ENOTSUP) {
             error("Formatting not supported for file format '%s'", fmt);
@@ -520,7 +538,7 @@
         /* signal EOF to align */
         bdrv_write_compressed(out_bs, 0, NULL, 0);
     } else {
-        sector_num = 0;
+        sector_num = 0; // total number of sectors converted so far
         for(;;) {
             nb_sectors = total_sectors - sector_num;
             if (nb_sectors <= 0)
@@ -543,6 +561,20 @@
             if (n > bs_offset + bs_sectors - sector_num)
                 n = bs_offset + bs_sectors - sector_num;
 
+            /* If the output image is being created as a copy on write image,
+               assume that sectors which are unallocated in the input image
+               are present in both the output's and input's base images (no
+               need to copy them). */
+            if (out_baseimg) {
+                if (!bdrv_is_allocated(bs[bs_i], sector_num - bs_offset, n, &n1)) {
+                    sector_num += n1;
+                    continue;
+                }
+                /* The next 'n1' sectors are allocated in the input image. Copy
+                   only those as they may be followed by unallocated sectors. */
+                n = n1;
+            }
+
             if (bdrv_read(bs[bs_i], sector_num - bs_offset, buf, n) < 0)
                 error("error while reading");
             /* NOTE: at the same time we convert, we do not write zero
@@ -550,7 +582,10 @@
                should add a specific call to have the info to go faster */
             buf1 = buf;
             while (n > 0) {
-                if (is_allocated_sectors(buf1, n, &n1)) {
+                /* If the output image is being created as a copy on write image,
+                   copy all sectors even the ones containing only NUL bytes,
+                   because they may differ from the sectors in the base image. */
+                if (out_baseimg || is_allocated_sectors(buf1, n, &n1)) {
                     if (bdrv_write(out_bs, sector_num, buf1, n1) < 0)
                         error("error while writing");
                 }
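
For readers who want to try the skip-unallocated logic outside of QEMU, here is a minimal standalone sketch of the loop that the qemu-img.c hunk adds for the -B case. It is not part of the patch and does not use the QEMU block layer: ToyImage, toy_is_allocated() and toy_convert_cow() are hypothetical names invented for this illustration, and the image is just an in-memory array with one allocation flag per sector. toy_is_allocated() is assumed to follow the same contract as the bdrv_is_allocated() comment in block.c (return the state of the first sector, set *pnum to the length of the run in that state, capped at nb_sectors).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512

/* Hypothetical in-memory stand-in for a copy-on-write image. */
typedef struct {
    int64_t total_sectors;
    const uint8_t *allocated; /* allocated[i] != 0: sector i is stored in this image */
    const uint8_t *data;      /* total_sectors * SECTOR_SIZE bytes */
} ToyImage;

/*
 * Returns 1 if 'sector_num' is allocated, 0 otherwise. '*pnum' receives the
 * number of consecutive sectors, starting at 'sector_num', that share this
 * state, never more than 'nb_sectors'.
 */
int toy_is_allocated(const ToyImage *img, int64_t sector_num, int nb_sectors,
                     int *pnum)
{
    int state, n = 0;

    if (sector_num >= img->total_sectors) {
        *pnum = 0;
        return 0;
    }
    state = img->allocated[sector_num] != 0;
    while (n < nb_sectors && sector_num + n < img->total_sectors &&
           (img->allocated[sector_num + n] != 0) == state) {
        n++;
    }
    *pnum = n;
    return state;
}

/*
 * Copies only the allocated runs of 'in' into 'out_data', at most 'chunk'
 * sectors at a time, and marks them in 'out_allocated'. Unallocated runs are
 * skipped because they are assumed to be served by the shared base image.
 * Returns the number of sectors copied.
 */
int64_t toy_convert_cow(const ToyImage *in, uint8_t *out_data,
                        uint8_t *out_allocated, int chunk)
{
    int64_t sector_num = 0, copied = 0;
    int n, n1;

    assert(chunk > 0);
    while (sector_num < in->total_sectors) {
        int64_t left = in->total_sectors - sector_num;

        n = left < chunk ? (int)left : chunk;
        if (!toy_is_allocated(in, sector_num, n, &n1)) {
            sector_num += n1; /* skip the whole unallocated run */
            continue;
        }
        /* Copy only the n1 allocated sectors; an unallocated run may follow. */
        memcpy(out_data + sector_num * SECTOR_SIZE,
               in->data + sector_num * SECTOR_SIZE,
               (size_t)n1 * SECTOR_SIZE);
        memset(out_allocated + sector_num, 1, (size_t)n1);
        sector_num += n1;
        copied += n1;
    }
    return copied;
}
```

As in the patch, every allocated sector is copied even if it contains only NUL bytes, since it may differ from the corresponding sector in the base image.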