From mboxrd@z Thu Jan  1 00:00:00 1970
From: John Byrne <john.l.byrne@hp.com>
Subject: Re: vbd flushing during migration?
Date: Mon, 31 Jul 2006 15:26:57 -0700
Message-ID: <44CE83B1.1090605@hp.com>
References: <44CE5C89.4070602@hp.com>
	<eacc82a40607311256s79c6b2a8tbdae53f6761fd39@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xensource.com>
In-Reply-To: <eacc82a40607311256s79c6b2a8tbdae53f6761fd39@mail.gmail.com>
List-Unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xensource.com>
List-Help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-Subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Andrew Warfield <andrew.warfield@cl.cam.ac.uk>
Cc: xen-devel <xen-devel@lists.xensource.com>
List-Id: xen-devel@lists.xenproject.org

It would be a bit ugly, but mostly straightforward, to watch for the 
destruction of the vbds (or all devices) after the destroyDomain() is 
done and then send an all-clear. (The last time I looked there wasn't 
a waitForDomainDestroy() anywhere, so it would probably be best to write 
one.) This would guarantee correctness, which is the most important thing.
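
A minimal sketch of such a wait, in Python since Xend is Python. 
Everything here is an assumption for illustration: wait_for_vbds_gone() 
is a hypothetical helper, and list_vbds is a hypothetical callable 
standing in for a listing of the domain's backend vbd directories in 
xenstore. It polls rather than registering real xenstore watches, and it 
uses time.monotonic(), which needs a modern Python.

```python
import time


def wait_for_vbds_gone(list_vbds, timeout=10.0, poll_interval=0.05):
    """Return True once list_vbds() reports no backend vbd entries.

    list_vbds is a hypothetical callable that returns the names of the
    backend vbd nodes still present for the domain being destroyed.
    Returns False if entries are still present when the timeout expires.
    """
    deadline = time.monotonic() + timeout
    while True:
        if not list_vbds():
            # All backend vbd directories are gone: safe to send all-clear.
            return True
        if time.monotonic() >= deadline:
            # Outstanding requests did not drain in time; caller must
            # treat the migration as failed rather than start the target.
            return False
        time.sleep(poll_interval)
```

The caller would send the all-clear only when this returns True, and 
treat a False return as a migration error.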

The problem I see with that strategy is the effect on downtime during a 
live-move. Ideally you'd like to start the vbd cleanup when the final 
suspend is done and hope to parallelize any final device operations 
with the final pass of live-move. How to do that and play nice with 
domain destruction on the normal path and handle errors seems a lot less 
clear to me.
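
To make the overlap concrete, here is a rough sketch of the ordering 
only. The three callables (drain_vbds, send_final_pass, send_all_clear) 
are hypothetical placeholders, not existing Xend functions, and the 
sketch does not address how this would interact with domain destruction 
on the normal path.

```python
import threading


def final_phase(drain_vbds, send_final_pass, send_all_clear,
                drain_timeout=30.0):
    """Overlap vbd cleanup with the final live-move pass.

    All three callables are hypothetical placeholders. Returns True if
    the all-clear was sent, False if the drain failed or timed out.
    """
    done = threading.Event()
    result = {}

    def _drain():
        try:
            drain_vbds()
        except Exception as exc:
            # Remember the failure so the all-clear is never sent.
            result["error"] = exc
        finally:
            done.set()

    worker = threading.Thread(target=_drain)
    worker.daemon = True
    worker.start()              # cleanup starts right after final suspend

    send_final_pass()           # runs while the vbds are draining

    if not done.wait(drain_timeout) or "error" in result:
        return False            # target must not start executing
    send_all_clear()
    return True
```

The point of the ordering is that the all-clear still waits for the 
drain, so correctness is unchanged; only the waiting overlaps with the 
final pass.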

So, are you just ignoring the notion of minimizing downtime for the 
moment or is there something I'm missing?

John

Andrew Warfield wrote:
> It's slightly more than a flush that's required.  The migration
> protocol needs to be extended so that execution on the target host
> doesn't start until all of the outstanding (i.e. issued by the
> backend) block requests have been either cancelled or acknowledged.
> This should be pretty straightforward given that the backend driver
> ref counts a blkif's state based on pending requests, and won't tear
> down the backend directory in xenstore until all the outstanding
> requests have cleared.  All that is likely required is to have the
> migration code register watches on the backend vbd directories, and
> wait for them to disappear before giving the all-clear to the new
> host.
> 
> We've talked about this enough to know how to fix it, but haven't had
> a chance to hack it up.  (I think Julian has looked into the problem a
> bit for blktap, but not yet done a general fix.) Patches would
> certainly be welcome though. ;)
> 
> a.
> 
> On 7/31/06, John Byrne <john.l.byrne@hp.com> wrote:
>>
>> Hi,
>>
>> I don't see any obvious flush to disk taking place for vbd's on the
>> source host in XendCheckpoint.py before the domain is started on the new
>> host. Is there a guarantee that all written data is on disk somewhere
>> else or is something needed?
>>
>> Thanks,
>>
>> John Byrne
>>
>>
>