From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=58460 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1PgG7j-0007F2-At
	for qemu-devel@nongnu.org; Fri, 21 Jan 2011 07:29:56 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1PgG7d-0007TF-Mf
	for qemu-devel@nongnu.org; Fri, 21 Jan 2011 07:29:51 -0500
Received: from mx1.redhat.com ([209.132.183.28]:48986)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from ) id 1PgG7d-0007Sj-FB
	for qemu-devel@nongnu.org; Fri, 21 Jan 2011 07:29:45 -0500
Message-ID: <4D397C8E.7080703@redhat.com>
Date: Fri, 21 Jan 2011 13:31:10 +0100
From: Kevin Wolf
MIME-Version: 1.0
Subject: Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
References: <1295449188-17877-1-git-send-email-Pierre.Riteau@irisa.fr>
	<04350B7C-9933-4A70-8FA9-B5B409D1E10A@irisa.fr>
	<43211019-BF0D-405A-99B7-54C9B3BBE58E@irisa.fr>
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Yoshiaki Tamura
Cc: "qemu-devel@nongnu.org" , Pierre Riteau

On 21.01.2011 13:15, Yoshiaki Tamura wrote:
> 2011/1/21 Pierre Riteau:
>> On 20 Jan 2011, at 17:18, Yoshiaki Tamura wrote:
>>
>>> 2011/1/20 Pierre Riteau:
>>>> On 20 Jan 2011, at 03:06, Yoshiaki Tamura wrote:
>>>>
>>>>> 2011/1/19 Pierre Riteau:
>>>>>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
>>>>>> value of bdrv_write and aborts migration when it fails. However, if the
>>>>>> size of the block device to migrate is not a multiple of BLOCK_SIZE
>>>>>> (currently 1 MB), the last bdrv_write will fail with -EIO.
>>>>>>
>>>>>> Fixed by calling bdrv_write with the correct size of the last block.
>>>>>> ---
>>>>>>  block-migration.c |   16 +++++++++++++++-
>>>>>>  1 files changed, 15 insertions(+), 1 deletions(-)
>>>>>>
>>>>>> diff --git a/block-migration.c b/block-migration.c
>>>>>> index 1475325..eeb9c62 100644
>>>>>> --- a/block-migration.c
>>>>>> +++ b/block-migration.c
>>>>>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>      int64_t addr;
>>>>>>      BlockDriverState *bs;
>>>>>>      uint8_t *buf;
>>>>>> +    int64_t total_sectors;
>>>>>> +    int nr_sectors;
>>>>>>
>>>>>>      do {
>>>>>>          addr = qemu_get_be64(f);
>>>>>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>                  return -EINVAL;
>>>>>>              }
>>>>>>
>>>>>> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>>>>>> +            if (total_sectors <= 0) {
>>>>>> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
>>>>>> +                return -EINVAL;
>>>>>> +            }
>>>>>> +
>>>>>> +            if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
>>>>>> +                nr_sectors = total_sectors - addr;
>>>>>> +            } else {
>>>>>> +                nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
>>>>>> +            }
>>>>>> +
>>>>>>              buf = qemu_malloc(BLOCK_SIZE);
>>>>>>
>>>>>>              qemu_get_buffer(f, buf, BLOCK_SIZE);
>>>>>> -            ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
>>>>>> +            ret = bdrv_write(bs, addr, buf, nr_sectors);
>>>>>>
>>>>>>              qemu_free(buf);
>>>>>>              if (ret < 0) {
>>>>>> --
>>>>>> 1.7.3.5
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> Hi Pierre,
>>>>>
>>>>> I don't think the fix above is correct. If you have a file which
>>>>> isn't aligned with BLOCK_SIZE, you won't get an error with the
>>>>> patch. However, the receiver doesn't know how many sectors the
>>>>> sender wants written, so the guest may fail after
>>>>> migration because some data may not be written.
>>>>> IIUC, although changing the bytestream should be avoided as much
>>>>> as possible, we should save/load total_sectors to check that an
>>>>> appropriate file is allocated on the receiver side.
>>>>
>>>> Isn't the guest supposed to be started using a file with the correct size?
>>>
>>> I personally don't like that; it demands too much of the user.
>>> Can't we expand the image on the fly? We can just abort if expanding
>>> failed anyway.
>>
>> At first I thought your expansion idea was best, but now I think there are valid scenarios where it fails.
>>
>> Imagine both sides are not using a file but a disk partition as storage. If the partition size is not rounded to 1 MB, the last write will fail with the current code, and there is no way we can expand the partition.
>>
>
> Right. But in the case of a partition, doesn't the check in the patch below
> return an error? Does bdrv_getlength return the size correctly?

I'm pretty sure that it does. We would have problems in other places if
it didn't (e.g. we're checking if I/O requests are within the disk size).

Kevin
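
For readers following the thread, the arithmetic behind the quoted patch can be sketched as a small standalone C fragment. This is only an illustration under stated assumptions: the constants are assumed to mirror the QEMU values discussed above (512-byte sectors, 1 MB BLOCK_SIZE), chunk_sectors is a hypothetical helper and not a QEMU function, and the check that addr lies inside the device is an extra guard that the quoted patch does not contain.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, mirroring the QEMU definitions discussed in the thread. */
#define SECTOR_BITS       9                           /* 512-byte sectors */
#define CHUNK_BYTES       (1 << 20)                   /* BLOCK_SIZE: 1 MB */
#define SECTORS_PER_CHUNK (CHUNK_BYTES >> SECTOR_BITS) /* 2048 sectors */

/*
 * Hypothetical helper (not part of QEMU): number of sectors to write for
 * the chunk that starts at sector `addr` on a device of `length_bytes`.
 * Returns -1 if the length is not positive or if `addr` is outside the
 * device; the latter check is an assumption added for this sketch.
 */
int chunk_sectors(int64_t length_bytes, int64_t addr)
{
    int64_t total_sectors = length_bytes >> SECTOR_BITS;
    int64_t n;

    if (total_sectors <= 0 || addr < 0 || addr >= total_sectors) {
        return -1;
    }

    if (total_sectors - addr < SECTORS_PER_CHUNK) {
        n = total_sectors - addr;      /* short final chunk */
    } else {
        n = SECTORS_PER_CHUNK;         /* full 1 MB chunk */
    }

    /* Postcondition: never write more than one chunk, never zero sectors. */
    assert(n > 0 && n <= SECTORS_PER_CHUNK);
    return (int)n;
}
```

With these assumed constants, a 1.5 MB device has 3072 sectors, so the chunk at sector 0 is a full 2048 sectors and the chunk at sector 2048 is clamped to 1024. As in the quoted patch, the receiver would still read a full BLOCK_SIZE buffer from the stream and only shorten the write.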