From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andrew Cooper
Subject: Re: [Patch v2] tools/migrate: Fix regression when migrating from older version of Xen
Date: Mon, 22 Jul 2013 19:06:39 +0100
Message-ID: <51ED74AF.9010605@citrix.com>
References: <1373971931-7211-1-git-send-email-andrew.cooper3@citrix.com>
 <51E8176D.5030809@citrix.com> <51ED7143.6090006@citrix.com>
 <1374515941.6623.30.camel@hastur.hellion.org.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <1374515941.6623.30.camel@hastur.hellion.org.uk>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Ian Campbell
Cc: rshriram@cs.ubc.ca, Xen-devel
List-Id: xen-devel@lists.xenproject.org

On 22/07/13 18:59, Ian Campbell wrote:
> On Mon, 2013-07-22 at 18:52 +0100, Andrew Cooper wrote:
>> On 18/07/13 19:47, Shriram Rajagopalan wrote:
>>
>>> On Thu, Jul 18, 2013 at 12:27 PM, Andrew Cooper
>>> wrote:
>>> On 18/07/13 17:20, Shriram Rajagopalan wrote:
>>>
>>> > On Tue, Jul 16, 2013 at 6:52 AM, Andrew Cooper
>>> > wrote:
>>> > and you setting ctx->last_checkpoint = 0 basically means that you are
>>> > banking on the far end (with an older version of the tools) to
>>> > close the socket, causing "get pagebuf" to fail and subsequently land
>>> > on finish.  IIRC, this was the behavior before
>>> > XC_SAVE_ID_LAST_CHECKPOINT was introduced by Ian Campbell, to get rid
>>> > of this benign error message.
>>> >
>>>
>>> That might have been 'fine' for PV guests, but it caused
>>> active breakage for HVM domains, where the qemu save record
>>> immediately follows in the migration stream.
>>>
>>> Just to clarify, the code flows like this, IIRC:
>>>
>>> loadpages:
>>>   while (1)
>>>     if !completed
>>>       get pagebufs
>>>       if 0 pages sent, break
>>>     endif
>>>     apply batch (pagebufs)
>>>   endwhile
>>>
>>>   if !completed
>>>     get tailbuf  [[this is where the QEMU record would be obtained]]
>>>     completed = 1
>>>   endif
>>>
>>>   if last_checkpoint
>>>     goto finish
>>>   endif
>>>
>>>   get pagebuf or goto finish on error  ---> this is where old code used to exit
>>>   get tailbuf
>>>   goto loadpages
>>>
>>> finish:
>>>   apply tailbuf  [tailbuf obtained inside the 'if !completed' block]
>>>   do the rest of the restore
>>
>> (Logically joining the two divergent threads, as this is the answer to
>> both)
>>
>> This has nothing to do with the buffering mode, and that is not where
>> the Qemu record would be obtained from.
>>
>> As the code currently stands, if a XC_SAVE_ID_LAST_CHECKPOINT chunk is
>> not seen in the stream, we complete the loadpages: section, including
>> the magic pages (TSS, console info etc).
>>
>> We then call buffer_tail() on line 171, set ctx->completed on line
>> 1725, but fail the ctx->last_checkpoint check on line 1758.
>>
>> What we should do is pass the last_checkpoint test and goto finish,
>> which then calls dump_qemu().  What actually happens is a call to
>> pagebuf_get() on line 1766, which raises an error because it finds a
>> Qemu save record rather than more pages.
>>
>> So this is very much a bug directly introduced by 00a4b65f85, and can
>> only be fixed with knowledge from the higher levels of the toolstack.
>>
>> As for the wording of the parameter, I still prefer the original
>> "last_checkpoint_unaware" over "checkpointed_stream", as it is more
>> accurate.
>>
>> Any migration stream started from a version of the tools after c/s
>> 00a4b65f85 will work, whether it is checkpointed or not (as the
>> XC_SAVE_ID_LAST_CHECKPOINT chunk will be sent in the correct place).
>> Any migration started from a version of the tools before c/s
>> 00a4b65f85 will fail, because it has no idea that the receiving end is
>> expecting it to insert an XC_SAVE_ID_LAST_CHECKPOINT chunk.  The fault
>> here is with the receiving end expecting to find an
>> XC_SAVE_ID_LAST_CHECKPOINT chunk.
>>
>> The only fix is for the newer toolstack to be aware that the migration
>> stream is from an older toolstack, and to set
>> last_checkpoint_unaware=1, which will set ctx->last_checkpoint to 1,
>> allowing the receiving side of the migration to correctly read the
>> migration stream.
> The toolstack cannot in general know that (i.e. how does xl know?)
>
> It does know whether it is doing checkpointing or not, though.
>
> There are four cases:
>
> Regular migrate from pre-00a4b65f85 to post-00a4b65f85.  Receiver sets
> last_checkpoint = 1 at start of day.  Doesn't care that the sender never
> sends XC_SAVE_ID_LAST_CHECKPOINT.
>
> Regular migrate from post-00a4b65f85 to post-00a4b65f85.  Receiver sets
> last_checkpoint = 1 at start of day.  Doesn't care that it sees
> XC_SAVE_ID_LAST_CHECKPOINT again, just ends up (pointlessly) setting
> last_checkpoint = 1.
>
> Checkpointed migrate from post-00a4b65f85 to post-00a4b65f85.  Receiver
> doesn't set last_checkpoint = 1 at start of day.
> XC_SAVE_ID_LAST_CHECKPOINT is sent at the right time, both sides
> agree, and things work.
>
> Checkpointed migrate from pre-00a4b65f85 to post-00a4b65f85.  We don't
> support this, because checkpoints should only be supported between like
> versions of Xen (the N->N+1 case doesn't apply).
>
> Ian.
>
>

Ok - that seems to make it easier.  I will see about plumbing the
correct information through from xend/remus.

~Andrew
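
A minimal, self-contained C sketch of the receiver-side flow discussed in
this thread.  This is not the real xc_domain_restore.c: struct restore_ctx,
get_pagebufs(), apply_batch(), buffer_tail() and dump_qemu() are illustrative
stubs, and only the shape of the logic follows the pseudocode quoted above.
The point it shows is the one being argued: if the caller knows the stream is
not checkpointed (a plain migrate, possibly from a pre-00a4b65f85 sender) and
asserts "last checkpoint" up front, the restore path goes straight to finish
after the tailbuf and consumes the qemu record, instead of calling
pagebuf_get() again and failing on it.

/* Sketch only -- not the real libxc restore code. */
#include <stdbool.h>
#include <stdio.h>

struct restore_ctx {
    bool completed;        /* tailbuf has been read */
    bool last_checkpoint;  /* no further checkpoints will follow */
};

/* Illustrative stubs for the real stream-reading helpers. */
static bool get_pagebufs(void) { return false; } /* false => zero-page batch */
static void apply_batch(void)  { }
static void buffer_tail(void)  { }
static void dump_qemu(void)    { puts("qemu save record consumed at finish"); }

static void restore(struct restore_ctx *ctx)
{
    bool have_batch = false;  /* a pagebuf was fetched before goto loadpages */

loadpages:
    for (;;) {
        if (!have_batch && !get_pagebufs())
            break;                     /* zero-page batch: end of this pass */
        have_batch = false;
        apply_batch();
    }

    if (!ctx->completed) {
        buffer_tail();                 /* the qemu record lives in the tailbuf */
        ctx->completed = true;
    }

    if (ctx->last_checkpoint)
        goto finish;                   /* non-checkpointed stream: nothing more */

    /* Checkpointed stream: expect another checkpoint's worth of pages.
     * A pre-00a4b65f85 sender never says "last checkpoint", so without the
     * up-front flag a plain migrate lands here and chokes on the qemu record. */
    if (!get_pagebufs())
        goto finish;
    have_batch = true;
    goto loadpages;

finish:
    dump_qemu();
}

int main(void)
{
    /* The caller (xl/xend/remus) knows whether this is a checkpointed
     * stream; for a plain migrate it asserts "last checkpoint" up front. */
    struct restore_ctx ctx = { .completed = false, .last_checkpoint = true };
    restore(&ctx);
    return 0;
}

Builds with, e.g., "gcc -Wall sketch.c".  Flipping last_checkpoint to false
in main() models the checkpointed case, where the receiver keeps looping
until the stream actually ends.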