From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Charles Coffing" Subject: Re: vbd flushing during migration? Date: Tue, 01 Aug 2006 15:28:20 -0400 Message-ID: <44CF56C2.D169.003C.0@novell.com> References: <44CE5C89.4070602@hp.com> <44CE83B1.1090605@hp.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="=__Part0C298A44.0__=" Return-path: In-Reply-To: <44CE83B1.1090605@hp.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: Andrew Warfield , John Byrne Cc: xen-devel@lists.xensource.com List-Id: xen-devel@lists.xenproject.org --=__Part0C298A44.0__= Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Content-Disposition: inline I've got a patch in our tree that does (basically) what John is describing. The exact bug we hit was that an "xm shutdown -w vm" did not wait until the vbds were cleared out before returning. So now I wait until the backend/vbd nodes go away before returning. This could probably be done more cleanly with watches, and should be abstracted out to be sure it applies equally to migration, and so forth. But for the sake of discussion, the patch is attached. -Charles >>> On Mon, Jul 31, 2006 at 4:26 PM, in message <44CE83B1.1090605@hp.com>, John Byrne wrote: > It would be a bit ugly, but mostly straightforward to watch for the > destruction of the vbds (or all devices) after the destroyDomain() is > done and then sending an all-clear. (The last time I looked there wasn't > a waitForDomainDestroy() anywhere, so it would probably be best to write > one.) This would guarantee correctness, which is the most important thing. > > The problem I see with that strategy is the effect on downtime during a > live-move. Ideally you'd like to start the vbd cleanup when the final > suspend is done and hope to parallelize any final device operations > with the final pass of live-move. 
How to do that and play nice with > domain destruction on the normal path and handle errors seems a lot less > clear to me. > > So, are you just ignoring the notion of minimizing downtime for the > moment or is there something I'm missing? > > John > > Andrew Warfield wrote: >> It's slightly more than a flush that's required. The migration >> protocol needs to be extended so that execution on the target host >> doesn't start until all of the outstanding (i.e. issued by the >> backend) block requests have been either cancelled or acknowledged. >> This should be pretty straightforward given that the backend driver >> ref counts a blkif's state based on pending requests, and won't tear >> down the backend directory in xenstore until all the outstanding >> requests have cleared. All that is likely required is to have the >> migration code register watches on the backend vbd directories, and >> wait for them to disappear before giving the all-clear to the new >> host. >> >> We've talked about this enough to know how to fix it, but haven't had >> a chance to hack it up. (I think Julian has looked into the problem a >> bit for blktap, but not yet done a general fix.) Patches would >> certainly be welcome though. ;) >> >> a. >> >> On 7/31/06, John Byrne wrote: >>> >>> Hi, >>> >>> I don't see any obvious flush to disk taking place for vbd's on the >>> source host in XendCheckpoint.py before the domain is started on the new >>> host. Is there a guarantee that all written data is on disk somewhere >>> else or is something needed? 
>>> >>> Thanks, >>> >>> John Byrne >>> >>> >>> _______________________________________________ >>> Xen- devel mailing list >>> Xen- devel@lists.xensource.com >>> http://lists.xensource.com/xen- devel >>> >> > > > _______________________________________________ > Xen- devel mailing list > Xen- devel@lists.xensource.com > http://lists.xensource.com/xen- devel --=__Part0C298A44.0__= Content-Type: application/octet-stream; name="xen-shutdown-wait.diff" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="xen-shutdown-wait.diff" SW5kZXg6IHhlbi11bnN0YWJsZS90b29scy9weXRob24veGVuL3htL3NodXRkb3duLnB5Cj09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT0KLS0tIHhlbi11bnN0YWJsZS5vcmlnL3Rvb2xzL3B5dGhvbi94ZW4veG0vc2h1dGRvd24u cHkKKysrIHhlbi11bnN0YWJsZS90b29scy9weXRob24veGVuL3htL3NodXRkb3duLnB5CkBAIC01 Miw2ICs1Miw4IEBAIGRlZiBzaHV0ZG93bihvcHRzLCBkb21zLCBtb2RlLCB3YWl0KToKICAgICBm b3IgZCBpbiBkb21zOgogICAgICAgICBzZXJ2ZXIueGVuZC5kb21haW4uc2h1dGRvd24oZCwgbW9k ZSkKICAgICBpZiB3YWl0OgorICAgICAgICBmcm9tIHhlbi54ZW5kLnhlbnN0b3JlLnhzdHJhbnNh Y3QgaW1wb3J0IHhzdHJhbnNhY3QKKyAgICAgICAgZG9tc190b19jbGVhbnVwID0gZG9tc1s6XQog ICAgICAgICB3aGlsZSBkb21zOgogICAgICAgICAgICAgYWxpdmUgPSBzZXJ2ZXIueGVuZC5kb21h aW5zKDApCiAgICAgICAgICAgICBkZWFkID0gW10KQEAgLTYyLDYgKzY0LDE3IEBAIGRlZiBzaHV0 ZG93bihvcHRzLCBkb21zLCBtb2RlLCB3YWl0KToKICAgICAgICAgICAgICAgICBvcHRzLmluZm8o IkRvbWFpbiAlcyB0ZXJtaW5hdGVkIiAlIGQpCiAgICAgICAgICAgICAgICAgZG9tcy5yZW1vdmUo ZCkKICAgICAgICAgICAgIHRpbWUuc2xlZXAoMSkKKyAgICAgICAgIyBOb3cgYWxsIHRoZSBkb21h aW5zIGFyZSB0ZXJtaW5hdGVkLCBidXQgd2FpdCB1bnRpbCB0aGUgZGV2aWNlcyBhcmUKKyAgICAg ICAgIyBjbGVhbmVkIHVwLgorICAgICAgICBmb3IgZCBpbiBkb21zX3RvX2NsZWFudXA6CisgICAg ICAgICAgICBpbmZvID0gc2VydmVyLnhlbmQuZG9tYWluKGQpCisgICAgICAgICAgICBkb21pZCA9 IGludChzeHAuY2hpbGRfdmFsdWUoaW5mbywgJ2RvbWlkJywgJy0xJykpCisgICAgICAgICAgICBk ZXZpY2VfY2xhc3NfcGF0aCA9ICcvbG9jYWwvZG9tYWluLzAvYmFja2VuZC92YmQvJWQvJyAlIGRv 
bWlkCisgICAgICAgICAgICB3aGlsZSBUcnVlOgorICAgICAgICAgICAgICAgIGRldmljZXMgPSB4 c3RyYW5zYWN0Lkxpc3QoZGV2aWNlX2NsYXNzX3BhdGgpCisgICAgICAgICAgICAgICAgaWYgbGVu KGRldmljZXMpID09IDA6CisgICAgICAgICAgICAgICAgICAgIGJyZWFrCisgICAgICAgICAgICAg ICAgdGltZS5zbGVlcCgxKQogICAgICAgICBvcHRzLmluZm8oIkFsbCBkb21haW5zIHRlcm1pbmF0 ZWQiKQogCiBkZWYgc2h1dGRvd25fbW9kZShvcHRzKToK --=__Part0C298A44.0__= Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel --=__Part0C298A44.0__=--