qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Paolo Bonzini <pbonzini@redhat.com>
To: "Zhanghaoyu (A)" <haoyu.zhang@huawei.com>
Cc: "Huangweidong (C)" <weidong.huang@huawei.com>,
	Gleb Natapov <gleb@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Luonengjun <luonengjun@huawei.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	Zengjunliang <zengjunliang@huawei.com>,
	"Wangrui (K)" <moon.wangrui@huawei.com>
Subject: Re: [Qemu-devel] [PATCH] migration: avoid starting a new migration task while the previous one still exist
Date: Wed, 06 Nov 2013 10:04:36 +0100	[thread overview]
Message-ID: <527A0624.3040600@redhat.com> (raw)
In-Reply-To: <D3E216785288A145B7BC975F83A2ED1043EF1A15@szxeml556-mbx.china.huawei.com>

Il 06/11/2013 02:50, Zhanghaoyu (A) ha scritto:
>>>>> Avoid starting a new migration task while the previous one still
>>>> exist.
>>>>
>>>> Can you explain how to reproduce the problem?
>>>>
>>> When network disconnection between source and destination happened, 
>>> the migration thread stuck at below stack,
>>> Then I cancel the migration task, the migration state in qemu will be set to MIG_STATE_CANCELLED, so the migration job in libvirt quits.
>>> Then I perform migration again, at this time, the network reconnected 
>>> successfully, since the TCP timeout retransmission, above stack will not return immediately, so two migration tasks exist at the same time.
>>> And still worse, source qemu will crash, because of accessing the NULL 
>>> pointer in qemu_bh_schedule(s->cleanup_bh); statement in latter migration task, since the "s->cleanup_bh" had been deleted by previous migration task.
>>
>> Thanks for explaining.  CANCELLING looks like a useful addition.
>>
>> Why do you need both CANCELLING and COMPLETING?  The COMPLETED state should be set only after all I/O is done.
>
> There is a period of time from the timing of setting COMPLETED state to that of migration task exits,
> so it's problematic in COMPLETED->CANCELLED transition, but if applying your below proposal, the problem gone.
>
> do {
>     old_state = s->state;
>     if (old_state != MIG_STATE_SETUP && old_state != MIG_STATE_ACTIVE) {
>         break;
>     }
>     migrate_set_state(s, old_state, MIG_STATE_CANCELLED);
> } while (s->state != MIG_STATE_CANCELLED);

Ok.

>> I agree with Eric that the CANCELLING state should not be exposed via QMP.
>> "info migrate" and "query-migrate" can keep showing "active" for maximum backwards compatibility.
>>
>> More comments below.
>>
>>
>>> -    if (s->state != MIG_STATE_COMPLETED) {
>>> +    if (s->state != MIG_STATE_COMPLETING) {
>>>          qemu_savevm_state_cancel();
>>> +        if (s->state == MIG_STATE_CANCELLING) {
>>> +            migrate_set_state(s, MIG_STATE_CANCELLING, MIG_STATE_CANCELLED); 
>>> +        }
>>
>> I think you can remove the "if" and unconditionally call migrate_set_state.
> 
> Do you mean to remove the "if (s->state == MIG_STATE_CANCELLING)" ?
> The s->state probably is MIG_STATE_ERROR here, is it okay to unconditionally call migrate_set_state?

migrate_set_state has atomic_cmpxchg so it has an "implicit" if, but
you're right it's clearer this way.

Paolo

> Thanks,
> Zhang Haoyu
> 
>>
>>> +    }else {
>>> +        migrate_set_state(s, MIG_STATE_COMPLETING, 
>>> + MIG_STATE_COMPLETED);
>>>      }
>>>  
>>>      notifier_list_notify(&migration_state_notifiers, s);  }
>>>  
>>> -static void migrate_set_state(MigrationState *s, int old_state, int 
>>> new_state) -{
>>> -    if (atomic_cmpxchg(&s->state, old_state, new_state) == new_state) {
>>> -        trace_migrate_set_state(new_state);
>>> -    }
>>> -}
>>> -
>>>  void migrate_fd_error(MigrationState *s)  {
>>>      DPRINTF("setting error state\n"); @@ -328,7 +337,7 @@ static void 
>>> migrate_fd_cancel(MigrationState *s)  {
>>>      DPRINTF("cancelling migration\n");
>>>  
>>> -    migrate_set_state(s, s->state, MIG_STATE_CANCELLED);
>>> +    migrate_set_state(s, s->state, MIG_STATE_CANCELLING);
>>
>> Here probably we want something like
>>
>>    do {
>>        old_state = s->state;
>>        if (old_state != MIG_STATE_SETUP && old_state != MIG_STATE_ACTIVE) {
>>            break;
>>        }
>>        migrate_set_state(s, old_state, MIG_STATE_CANCELLING);
>>    } while (s->state != MIG_STATE_CANCELLING);
>>
>> to avoid a bogus COMPLETED->CANCELLED transition.  Please separate the patch in two parts:
>>
>> (1) the first uses the above code, with CANCELLED instead of CANCELLING
>>
>> (2) the second, similar to the one you have posted, introduces the new CANCELLING state
>>
>> Thanks,
>>
>> Paolo

  reply	other threads:[~2013-11-06  9:05 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-11-04 11:26 [Qemu-devel] [PATCH] migration: avoid starting a new migration task while the previous one still exist Zhanghaoyu (A)
2013-11-04 11:30 ` Paolo Bonzini
2013-11-05  2:23   ` Zhanghaoyu (A)
2013-11-05  9:15     ` Paolo Bonzini
2013-11-06  1:50       ` Zhanghaoyu (A)
2013-11-06  9:04         ` Paolo Bonzini [this message]
2013-11-04 13:59 ` Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=527A0624.3040600@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=gleb@redhat.com \
    --cc=haoyu.zhang@huawei.com \
    --cc=luonengjun@huawei.com \
    --cc=moon.wangrui@huawei.com \
    --cc=mst@redhat.com \
    --cc=mtosatti@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=weidong.huang@huawei.com \
    --cc=zengjunliang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).