From: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
To: Kevin Wolf <kwolf@redhat.com>
Cc: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
Pierre Riteau <Pierre.Riteau@irisa.fr>
Subject: Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
Date: Fri, 21 Jan 2011 23:18:17 +0900
Message-ID: <AANLkTinnjfouBFSJJ9rbHMNfp8DQEZ6H+YWtszeWmyrt@mail.gmail.com>
In-Reply-To: <4D399393.7030506@redhat.com>
2011/1/21 Kevin Wolf <kwolf@redhat.com>:
> On 21.01.2011 14:59, Yoshiaki Tamura wrote:
>> 2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>> On 21 Jan 2011, at 13:36, Yoshiaki Tamura wrote:
>>>
>>>> 2011/1/21 Kevin Wolf <kwolf@redhat.com>:
>>>>> On 21.01.2011 13:15, Yoshiaki Tamura wrote:
>>>>>> 2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>> On 20 Jan 2011, at 17:18, Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp> wrote:
>>>>>>>
>>>>>>>> 2011/1/20 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>>>> On 20 Jan 2011, at 03:06, Yoshiaki Tamura wrote:
>>>>>>>>>
>>>>>>>>>> 2011/1/19 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>>>>>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
>>>>>>>>>>> value of bdrv_write and aborts migration when it fails. However, if the
>>>>>>>>>>> size of the block device to migrate is not a multiple of BLOCK_SIZE
>>>>>>>>>>> (currently 1 MB), the last bdrv_write will fail with -EIO.
>>>>>>>>>>>
>>>>>>>>>>> Fixed by calling bdrv_write with the correct size of the last block.
>>>>>>>>>>> ---
>>>>>>>>>>> block-migration.c | 16 +++++++++++++++-
>>>>>>>>>>> 1 files changed, 15 insertions(+), 1 deletions(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/block-migration.c b/block-migration.c
>>>>>>>>>>> index 1475325..eeb9c62 100644
>>>>>>>>>>> --- a/block-migration.c
>>>>>>>>>>> +++ b/block-migration.c
>>>>>>>>>>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>>>>> int64_t addr;
>>>>>>>>>>> BlockDriverState *bs;
>>>>>>>>>>> uint8_t *buf;
>>>>>>>>>>> + int64_t total_sectors;
>>>>>>>>>>> + int nr_sectors;
>>>>>>>>>>>
>>>>>>>>>>> do {
>>>>>>>>>>> addr = qemu_get_be64(f);
>>>>>>>>>>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>>>>> return -EINVAL;
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> + total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>>>>>>>>>>> + if (total_sectors <= 0) {
>>>>>>>>>>> + fprintf(stderr, "Error getting length of block device %s\n", device_name);
>>>>>>>>>>> + return -EINVAL;
>>>>>>>>>>> + }
>>>>>>>>>>> +
>>>>>>>>>>> + if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
>>>>>>>>>>> + nr_sectors = total_sectors - addr;
>>>>>>>>>>> + } else {
>>>>>>>>>>> + nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
>>>>>>>>>>> + }
>>>>>>>>>>> +
>>>>>>>>>>> buf = qemu_malloc(BLOCK_SIZE);
>>>>>>>>>>>
>>>>>>>>>>> qemu_get_buffer(f, buf, BLOCK_SIZE);
>>>>>>>>>>> - ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
>>>>>>>>>>> + ret = bdrv_write(bs, addr, buf, nr_sectors);
>>>>>>>>>>>
>>>>>>>>>>> qemu_free(buf);
>>>>>>>>>>> if (ret < 0) {
>>>>>>>>>>> --
>>>>>>>>>>> 1.7.3.5
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi Pierre,
>>>>>>>>>>
>>>>>>>>>> I don't think the fix above is correct. If you have a file which
>>>>>>>>>> isn't aligned with BLOCK_SIZE, you won't get an error with the
>>>>>>>>>> patch. However, the receiver doesn't know how many sectors the
>>>>>>>>>> sender wants written, so the guest may fail after migration
>>>>>>>>>> because some data may not be written. IIUC, although changing
>>>>>>>>>> the bytestream should be avoided as much as possible, we should
>>>>>>>>>> save/load total_sectors to check that an appropriately sized
>>>>>>>>>> file is allocated on the receiver side.
>>>>>>>>>
>>>>>>>>> Isn't the guest supposed to be started using a file with the correct size?
>>>>>>>>
>>>>>>>> I personally don't like that; it demands too much of the user.
>>>>>>>> Can't we expand the image on the fly? We can just abort if
>>>>>>>> expanding fails anyway.
>>>>>>>
>>>>>>> At first I thought your expansion idea was best, but now I think there are valid scenarios where it fails.
>>>>>>>
>>>>>>> Imagine both sides are not using a file but a disk partition as storage. If the partition size is not rounded to 1 MB, the last write will fail with the current code, and there is no way we can expand the partition.
>>>>>>>
>>>>>>
>>>>>> Right. But in the case of a partition, doesn't the check in the
>>>>>> patch below return an error? Does bdrv_getlength return the size
>>>>>> correctly?
>>>>>
>>>>> I'm pretty sure that it does. We would have problems in other places if
>>>>> it didn't (e.g. we're checking if I/O requests are within the disk size).
>>>>
>>>> Sorry for the noise. I just learned it's returning the value of lseek
>>>> in case of raw-posix.
>>>
>>>
>>> And it does an ioctl call on platforms other than Linux.
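
For illustration only, the lseek-based length query mentioned above can
be sketched as follows. This is a minimal sketch assuming a POSIX system
and 512-byte sectors; path_length_sectors is a hypothetical helper name,
not a QEMU function, and the sketch does not reproduce QEMU's raw-posix
code or the ioctl path used on other platforms.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption for this sketch: 512-byte sectors (2^9 bytes). */
#define SKETCH_SECTOR_BITS 9

/*
 * Hypothetical helper: return the length of the file or block device at
 * `path` in whole sectors, or -1 on error. The size is obtained by
 * seeking to the end of the file, which also works for disk partitions
 * on Linux.
 */
static int64_t path_length_sectors(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }

    off_t end = lseek(fd, 0, SEEK_END);
    close(fd);
    if (end < 0) {
        return -1;
    }

    /* Truncates any trailing partial sector. */
    return (int64_t)(end >> SKETCH_SECTOR_BITS);
}
```

A device whose size is 1 MB plus one sector would report 2049 sectors
here, which is the kind of size that is not a multiple of 1 MB.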
>>
>> Thanks. Just a quick question regarding total_sectors.
>> BlockDriverState seems to contain total_sectors. Can we avoid
>> calling bdrv_getlength() if bs->total_sectors were already there?
>
> I'd need to check the details, but I think it may not be correct with
> growable files.
Does the growable flag mean that total_sectors can grow? Because
block migration requires users to preallocate a file of sufficient
size, it doesn't seem to be a problem, IIUC.
Yoshi
>
> Kevin
>
>
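
For reference, the clamping that the quoted patch applies to the last
chunk can be sketched in isolation. This is a minimal sketch assuming
512-byte sectors and 1 MB chunks as discussed in the thread; the
constants and the name chunk_nr_sectors are hypothetical and are not
taken from QEMU headers. Unlike the patch, the sketch also rejects a
start address at or beyond the end of the device.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants (assumptions): 512-byte sectors, 1 MB chunks. */
#define SKETCH_SECTOR_BITS       9
#define SKETCH_CHUNK_BYTES       (1 << 20)
#define SKETCH_SECTORS_PER_CHUNK (SKETCH_CHUNK_BYTES >> SKETCH_SECTOR_BITS)

/*
 * Hypothetical helper: number of sectors to write for the chunk that
 * starts at sector `addr` on a device of `total_sectors` sectors.
 * Returns -1 if the device length is invalid or `addr` is out of range.
 */
static int chunk_nr_sectors(int64_t total_sectors, int64_t addr)
{
    if (total_sectors <= 0 || addr < 0 || addr >= total_sectors) {
        return -1;
    }

    int64_t remaining = total_sectors - addr;
    int nr = remaining < SKETCH_SECTORS_PER_CHUNK
                 ? (int)remaining               /* short final chunk */
                 : SKETCH_SECTORS_PER_CHUNK;    /* full chunk */

    assert(nr > 0 && nr <= SKETCH_SECTORS_PER_CHUNK);
    return nr;
}
```

With a 5000-sector device, the chunk starting at sector 4096 would be
written with 904 sectors instead of 2048, so the final write stays
within the device.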
Thread overview: 22+ messages
2011-01-19 14:59 [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB Pierre Riteau
2011-01-20 2:06 ` Yoshiaki Tamura
2011-01-20 6:49 ` Pierre Riteau
2011-01-20 16:18 ` Yoshiaki Tamura
2011-01-21 8:08 ` Pierre Riteau
2011-01-21 9:11 ` Kevin Wolf
2011-01-21 12:26 ` Yoshiaki Tamura
2011-01-21 12:15 ` Yoshiaki Tamura
2011-01-21 12:31 ` Kevin Wolf
2011-01-21 12:36 ` Yoshiaki Tamura
2011-01-21 12:40 ` Pierre Riteau
2011-01-21 13:59 ` Yoshiaki Tamura
2011-01-21 14:09 ` Kevin Wolf
2011-01-21 14:18 ` Yoshiaki Tamura [this message]
2011-01-21 14:14 ` Pierre Riteau
2011-01-21 14:21 ` Yoshiaki Tamura
2011-01-21 14:23 ` Pierre Riteau
2011-01-21 14:30 ` Yoshiaki Tamura
2011-01-21 14:48 ` Pierre Riteau
2011-01-21 9:16 ` Kevin Wolf
2011-01-21 11:38 ` Pierre Riteau
2011-01-21 11:45 ` Kevin Wolf