From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:47048)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <wency@cn.fujitsu.com>) id 1Ztu9Z-0004LU-Ja
	for qemu-devel@nongnu.org; Wed, 04 Nov 2015 04:14:50 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <wency@cn.fujitsu.com>) id 1Ztu9V-0007DW-F2
	for qemu-devel@nongnu.org; Wed, 04 Nov 2015 04:14:49 -0500
Received: from [59.151.112.132] (port=30650 helo=heian.cn.fujitsu.com)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <wency@cn.fujitsu.com>) id 1Ztu9U-0007DE-DV
	for qemu-devel@nongnu.org; Wed, 04 Nov 2015 04:14:45 -0500
References: <20151103122353.GB17670@work-vm> <874mh3z1hb.fsf@emacs.mitica>
	<20151103134716.GC17670@work-vm> <5639770B.4090103@cn.fujitsu.com>
	<20151104090525.GA2702@work-vm>
From: Wen Congyang <wency@cn.fujitsu.com>
Message-ID: <5639CC37.7000906@cn.fujitsu.com>
Date: Wed, 4 Nov 2015 17:13:27 +0800
MIME-Version: 1.0
In-Reply-To: <20151104090525.GA2702@work-vm>
Content-Type: text/plain; charset="windows-1252"
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] safety of migration_bitmap_extend
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: den@openvz.org, qemu-devel@nongnu.org, lizhijian@cn.fujitsu.com, Juan Quintela <quintela@redhat.com>

On 11/04/2015 05:05 PM, Dr. David Alan Gilbert wrote:
> * Wen Congyang (wency@cn.fujitsu.com) wrote:
>> On 11/03/2015 09:47 PM, Dr. David Alan Gilbert wrote:
>>> * Juan Quintela (quintela@redhat.com) wrote:
>>>> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
>>>>> Hi,
>>>>>   I'm trying to understand why migration_bitmap_extend is correct/safe;
>>>>> If I understand correctly, you're arguing that:
>>>>>
>>>>>   1) the migration_bitmap_mutex around the extend stops any syncs from
>>>>>      happening, and so no new bits will be set during the extend.
>>>>>
>>>>>   2) If migration sends a page and clears a bitmap entry, it doesn't
>>>>>      matter if we lose the 'clear' because we're copying it as
>>>>>      we extend it, because losing the clear just means the page
>>>>>      gets resent, and so the data is OK.
>>>>>
>>>>> However, doesn't (2) mean that migration_dirty_pages might be wrong?
>>>>> If a page was sent, the bit cleared, and migration_dirty_pages decremented,
>>>>> then if we copy over that bitmap and 'set' that bit again then migration_dirty_pages
>>>>> is too small; that means that either migration would finish too early,
>>>>> or more likely, migration_dirty_pages would wrap-around -ve and
>>>>> never finish.
>>>>>
>>>>> Is there a reason it's really safe?
>>>>
>>>> No.  It is reasonably safe.  Various values of reasonably.
>>>>
>>>> migration_dirty_pages should never arrive at values near zero.  Because
>>>> we move to the completion stage way before it gets a value near zero.
>>>> (We could have very, very bad luck, as in it is not safe).
>>>
>>> That's only true if we hit the qemu_file_rate_limit() in ram_save_iterate;
>>> if we don't hit the rate limit (e.g. because we're CPU or network limited
>>> to slower than the set limit) then I think ram_save_iterate will go all the
>>> way to sending every page; if that happens it'll go once more
>>> around the main migration loop, and call the pending routine, and now get
>>> a -ve (very +ve) number of pending pages, so continuously do ram_save_iterate
>>> again.
>>>
>>> We've had that type of bug before when we messed up the dirty-pages calculation
>>> during hotplug.
>>
>> IIUC, migration_bitmap_extend() is called when we hotplug a device while
>> migration is running.
>>
>> In this case, I think we hold the iothread mutex when migration_bitmap_extend() is called.
>>
>> ram_save_complete() is also protected by the iothread mutex.
>>
>> So if migration_bitmap_extend() is called, the migration thread may be blocked in
>> migration_completion(), waiting for it to finish. qemu_savevm_state_complete() will
>> be called after migration_completion() returns.
> 
> But I don't think ram_save_iterate is protected by that lock, and my concern
> is that the dirty-pages calculation is wrong during the iteration phase, and then
> the iteration phase will never exit and never try and get to ram_save_complete.

Yes, the dirty-pages count may be wrong, but it is smaller, not larger, than the
exact value. Why would the iteration phase never exit?
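
For reference, the wrap-around scenario raised earlier in the thread can be
modelled in a few lines of C. This is only a sketch, not QEMU code: the struct,
the NPAGES constant, send_page() and simulate_lost_clear() are hypothetical
names, and it assumes the counter is an unsigned 64-bit value, which is what
the "-ve (very +ve)" remark above suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NPAGES 8 /* hypothetical tiny guest, for illustration only */

/* Hypothetical stand-ins for the migration bitmap and its dirty counter. */
struct dirty_state {
    bool bitmap[NPAGES];
    uint64_t dirty_pages; /* assumed unsigned, so it cannot go negative */
};

/* Send one page: if its bit is set, clear it and decrement the counter. */
static void send_page(struct dirty_state *s, int page)
{
    assert(page >= 0 && page < NPAGES);
    if (s->bitmap[page]) {
        s->bitmap[page] = false;
        s->dirty_pages--;
    }
}

/*
 * Model of the lost clear: the extend copies the old bitmap, a page is
 * sent in the meantime (bit cleared, counter decremented), and then the
 * stale copy is installed. The clear is lost but the decrement is kept.
 */
uint64_t simulate_lost_clear(void)
{
    struct dirty_state s;
    bool stale[NPAGES];

    for (int i = 0; i < NPAGES; i++) {
        s.bitmap[i] = true;
    }
    s.dirty_pages = NPAGES;

    memcpy(stale, s.bitmap, sizeof(stale)); /* extend copies the bitmap */
    send_page(&s, 0);                       /* concurrent send of page 0 */
    memcpy(s.bitmap, stale, sizeof(stale)); /* stale copy replaces it */

    /* All NPAGES bits are set again, but the counter is NPAGES - 1. */
    for (int i = 0; i < NPAGES; i++) {
        send_page(&s, i);
    }
    return s.dirty_pages; /* one decrement too many: wrapped past zero */
}
```

In this model the counter ends up one below zero, which for an unsigned type
is a very large positive number, so a loop that waits for the pending count
to become small would keep iterating.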

Thanks
Wen Congyang

> 
> Dave
> 
>>
>> Thanks
>> Wen Congyang
>>
>>>
>>>> Now, do we really care if migration_dirty_pages is exact?  Not really,
>>>> we just use it to calculate whether we should start the throttle or not.
>>>> We only test that once per second, so if we have written a couple of
>>>> pages that we are not accounting for, things should be reasonably safe.
>>>>
>>>> That said, I don't know why we didn't catch that problem during
>>>> review (yes, I am guilty here).  Not sure how to really fix it,
>>>> though.  I think that the problem is more theoretical than real, but
>>>
>>> Dave
>>>
>>>> ....
>>>>
>>>> Thanks, Juan.
>>>>
>>>>>
>>>>> Dave
>>>>>
>>>>> --
>>>>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK