References: <33f7c8c309e6625942e6b8548faa96606a6f99b1.1456212545.git.amit.shah@redhat.com>
 <20160307124911.GB2253@work-vm> <56DDCF55.6000203@openvz.org>
 <20160308104558.GA2251@work-vm>
From: "Denis V. Lunev"
Message-ID: <56DEAF69.1070607@openvz.org>
Date: Tue, 8 Mar 2016 13:54:33 +0300
In-Reply-To: <20160308104558.GA2251@work-vm>
Subject: Re: [Qemu-devel] [PULL 2/5] migration: move bdrv_invalidate_cache_all out of coroutine context
To: "Dr. David Alan Gilbert"
Cc: Amit Shah, Peter Maydell, Paolo Bonzini, qemu list, Juan Quintela

On 03/08/2016 01:45 PM, Dr. David Alan Gilbert wrote:
> * Denis V. Lunev (den@openvz.org) wrote:
>> On 03/07/2016 03:49 PM, Dr. David Alan Gilbert wrote:
>>> * Amit Shah (amit.shah@redhat.com) wrote:
>>>> From: "Denis V. Lunev"
>>>>
>>>> There is a possibility to hit an assert in qcow2_get_specific_info that
>>>> s->qcow_version is undefined. This happens when a VM is starting from a
>>>> suspended state, i.e. it is processing incoming migration, and at the
>>>> same time 'info block' is called.
>>>>
>>>> The problem is that qcow2_invalidate_cache() closes the image and
>>>> memset()s BDRVQcowState in the middle.
>>>>
>>>> The patch moves processing of bdrv_invalidate_cache_all out of
>>>> coroutine context for postcopy migration to avoid that. This function
>>>> is called with the following stack:
>>>>     process_incoming_migration_co
>>>>     qemu_loadvm_state
>>>>     qemu_loadvm_state_main
>>>>     loadvm_process_command
>>>>     loadvm_postcopy_handle_run
>>>>
>>>> Signed-off-by: Denis V. Lunev
>>>> Tested-by: Dr. David Alan Gilbert
>>> hmm; actually - this segs in a variety of different ways;
>>> there are two problems:
>>>
>>> a) + bh = qemu_bh_new(loadvm_postcopy_handle_run_bh, NULL);
>>> That's the easy one; that NULL should be 'mis', because
>>> the bh is expecting to use it as a MigrationIncomingState,
>>> so it segs fairly reliably in the qemu_bh_delete(mis->bh).
>>>
>>> b) The harder problem is that there's a race where qemu_bh_delete
>>> segs, and I'm not 100% sure why yet - it only does it sometimes
>>> (i.e. run virt-test and leave it and it occasionally does it).
>>> From the core it looks like mis->bh is corrupt (0x10101010...),
>>> so maybe mis has been freed at that point?
>>> I'm suspecting this is the postcopy_ram_listen_thread freeing
>>> mis at the end of it, but I don't know yet.
>>>
>>> Dave
>> Yes, this is exactly a use-after-free. I have looked into the code
>> and this seems correct.
>>
>> Could you try this simple patch?
> Hmm no, that's not right.
> The order for postcopy is that we are running the listen thread and then
> receive the 'run', and the listening thread is still running - so you
> can't destroy the incoming state during the run.
> It can't get destroyed until both the main thread has finished loading
> the migration AND the listen thread has finished.
>
> Hmm - that does give me an idea about the other seg I saw; I need to check it;
> but I think the problem is probably the case of a very short postcopy
> where the listen thread exits before the handle_run_bh is triggered
> (and since I've only seen it in my virt-test setup, and I know it can do
> very short postcopies).
> I think the fix here is to pass loadvm_postcopy_handle_run_bh a pointer to its
> own bh structure rather than store it in mis->bh; that way it doesn't use mis
> at all.
>
> Dave
>
>> Den
>>
>>
>> diff --git a/migration/savevm.c b/migration/savevm.c
>> index 96e7db5..9a020ef 100644
>> --- a/migration/savevm.c
>> +++ b/migration/savevm.c
>> @@ -1446,15 +1446,6 @@ static void *postcopy_ram_listen_thread(void *opaque)
>>
>>      migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
>>                        MIGRATION_STATUS_COMPLETED);
>> -    /*
>> -     * If everything has worked fine, then the main thread has waited
>> -     * for us to start, and we're the last use of the mis.
>> -     * (If something broke then qemu will have to exit anyway since it's
>> -     * got a bad migration state).
>> -     */
>> -    migration_incoming_state_destroy();
>> -
>> -
>>      return NULL;
>>  }
>>
>> @@ -1533,6 +1524,14 @@ static void loadvm_postcopy_handle_run_bh(void *opaque)
>>      }
>>
>>      qemu_bh_delete(mis->bh);
>> +
>> +    /*
>> +     * If everything has worked fine, then the main thread has waited
>> +     * for us to start, and we're the last use of the mis.
>> +     * (If something broke then qemu will have to exit anyway since it's
>> +     * got a bad migration state).
>> +     */
>> +    migration_incoming_state_destroy();
>>  }
>>
>>  /* After all discards we can start running and asking for pages */
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

This will help for sure. The idea to reuse the migration state seems wrong.

Den
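
For reference, problem (a) above is just the opaque argument given to
qemu_bh_new(): the bottom half uses it as a MigrationIncomingState *, so
passing NULL makes it crash as soon as it touches mis (e.g. in
qemu_bh_delete(mis->bh)). A minimal sketch of the one-argument change Dave
describes, assuming the call site quoted in his point (a):

    /* as posted: the bh receives NULL and later dereferences it as mis */
    bh = qemu_bh_new(loadvm_postcopy_handle_run_bh, NULL);

    /* the easy fix: hand the bh the MigrationIncomingState it expects */
    bh = qemu_bh_new(loadvm_postcopy_handle_run_bh, mis);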
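
Dave's second suggestion - letting the bottom half own its QEMUBH pointer so
it never touches mis after the listen thread may have torn the state down -
could look roughly like the sketch below. This is only an illustration of the
idea, not the patch that was eventually applied; the RunBhState holder is a
hypothetical helper introduced here purely to carry the bh pointer through the
opaque argument, and the fragment assumes the existing includes and helpers of
migration/savevm.c:

    /* Hypothetical holder so the bh can delete itself without going through mis. */
    typedef struct RunBhState {
        QEMUBH *bh;
    } RunBhState;

    static void loadvm_postcopy_handle_run_bh(void *opaque)
    {
        RunBhState *s = opaque;

        /* ... invalidate block caches, announce, vm_start() as before ... */

        qemu_bh_delete(s->bh);   /* our own bh; mis is never dereferenced here */
        g_free(s);
    }

    static int loadvm_postcopy_handle_run(MigrationIncomingState *mis)
    {
        RunBhState *s = g_new0(RunBhState, 1);

        /* Pass the holder, not mis, as the opaque pointer. */
        s->bh = qemu_bh_new(loadvm_postcopy_handle_run_bh, s);
        qemu_bh_schedule(s->bh);

        /* The existing state checks and return value stay as they are. */
        return LOADVM_QUIT;
    }

With this shape the race Dave describes cannot hit the bottom half: even if
postcopy_ram_listen_thread() finishes and the incoming state is destroyed
before the bh runs, the bh only reaches memory it owns.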