From mboxrd@z Thu Jan 1 00:00:00 1970
In-Reply-To: <20110726094846.GA3316@stefanha-thinkpad.localdomain>
References: <1309448777-1447-1-git-send-email-pbonzini@redhat.com> <4E2DFAE5.1050304@codemonkey.ws> <20110726094846.GA3316@stefanha-thinkpad.localdomain>
Date: Tue, 26 Jul 2011 13:51:13 +0100
From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [RFC PATCH 0/4] Fix subsection ambiguity in the migration format
To: Stefan Hajnoczi
Cc: Ryan Harper, mst@redhat.com, quintela@redhat.com, qemu-devel@nongnu.org, Paolo Bonzini

On Tue, Jul 26, 2011 at 10:48 AM, Stefan Hajnoczi wrote:
> On Mon, Jul 25, 2011 at 06:23:17PM -0500, Anthony Liguori wrote:
>> On 07/25/2011 04:10 PM, Paolo Bonzini wrote:
>> >On Thu, Jun 30, 2011 at 17:46, Paolo Bonzini wrote:
>> >>With the current migration format, VMS_STRUCTs with subsections
>> >>are ambiguous.  The protocol cannot tell whether a 0x5 byte after
>> >>the VMS_STRUCT is a subsection or part of the parent data stream.
>> >>In the past QEMU assumed it was always a part of a subsection; after
>> >>commit eb60260 (savevm: fix corruption in vmstate_subsection_load(),
>> >>2011-02-03) the choice depends on whether the VMS_STRUCT has subsections
>> >>defined.
>> >>
>> >>Unfortunately, this means that if a destination has no subsections
>> >>defined for the struct, it will happily read subsection data into
>> >>its own fields.  And if you are "lucky" enough to stumble on a
>> >>zero byte at the right time, it will be interpreted as QEMU_VM_EOF
>> >>and migration will be interrupted with half-loaded state.
>> >>
>> >>There is no way out of this except defining an incompatible
>> >>migration protocol.  Not-so-long-term we should really try to define
>> >>one that is not a joke, but the bug is serious so we need a solution
>> >>for 0.15.  A sentinel at the end of embedded structs does remove the
>> >>ambiguity.
>> >>
>> >>Of course, this can be restricted to new machine models, and this
>> >>is what the patch series does.  (And note that only patch 3 is specific
>> >>to the short-term solution, everything else is entirely generic).
>> >>
>> >>Untested beyond compilation.
>> >
>> >I have now tested this series (exactly as sent) both by examining
>> >manually the differences between the two formats on the same guest
>> >state, and by a mix of saves/restores (new on new, 0.14 on new
>> >pc-0.14, new pc-0.14 on 0.14; also the same combinations on RHEL).  It
>> >always does what is expected.
>> >
>> >Michael Tsirkin objected that the format should be passed as a
>> >parameter in the migrate command.  I kind of agree, however since this
>> >is a real bug you would need to bump the default for new machine
>> >types, and this default would still go in the QEMUMachine struct like
>> >I am doing.  So I consider the two settings to be orthogonal.
>> >Also, the alternative requires changes to the whole management stack
>> >and if the default is not changed it imposes a broken format unless you
>> >update the management tools.  Clearly much less bang for the buck.
>> >
>> >I think this is ready to go into 0.15.
>>
>> I'll take a look for 0.15.
>>
>> >The bug happens when migrating
>> >to 0.14 a pc-0.14 machine created with QEMU 0.15 and which has a
>> >floppy.  The media changed subsection is almost always included, and
>> >this causes problems when migrating to 0.14 which didn't have any
>> >subsection for the floppy device.  While QEMU support for migration to
>> >old version admittedly depends on luck, this isn't true of certain
>> >downstreams :) which would like to have an unambiguous migration
>> >format.
>>
>> So this got me thinking about where we're at with migration and
>> where we need to go.
>>
>> I actually think there might be a reasonable path forward if we
>> attack the problem differently than we have so far.
>>
>> == Today ==
>>
>> Today we only support generating the latest serialization of
>> devices. To increase the probability of the latest version working
>> on older versions of QEMU, we strategically omit fields that we know
>> can safely be omitted with older versions (subsections).  More than
>> likely, migrating new to old won't work.
>>
>> Migrating old to new is more likely to work.  We version each
>> section in order to be able to identify when we're dealing with old.
>>
>> But all of this logic lives in one of two forms.  Either as a
>> savevm/loadvm callback that takes a QEMUFile and writes byte
>> serialization to the stream in an open way (usually big endian) or
>> encoded declaratively in a VMState section.
>>
>> == What we need ==
>>
>> We need to decompose migration into three different problems: 1)
>> serializing device state 2) transforming the device model in order
>> to satisfy forwards and backwards compatibility 3) encoding the
>> serialized device model on the wire.
>>
>> We also need a way to future proof ourselves.
>>
>> == What we can do ==
>>
>> 1) Add migration capabilities to future proof ourselves.  I think
>> the simplest way this would work is to have a
>> 'query-migration-capabilities' command that returned a bitmask of
>> supported migration features.  I think we also introduce a
>> 'set-migration-capabilities' command that can mask some of the
>> supported features.
>>
>> A management tool would query-migration features on the source and
>> destination, take the intersection of the two masks, and set that
>> mask on both the source and destination.
>>
>> Lack of support for these commands indicates a mask of zero which is
>> the protocol we offer today.
>
> When the management tool drives negotiation it is possible to do nice
> error reporting (each capability bit has a meaning and detailed
> incompatibility errors can be generated).
>
> However, doing so imposes extra work on management tools - they need to
> understand and drive negotiation.  If QEMU adds a new capability we
> might even need to update management tools!
>
> As a management tool author I would prefer the source and destination to
> work it out amongst themselves so that I just issue the 'migrate'
> command.  Negotiation can be done without the management tool's
> involvement: fail migration if the initial negotiation phase fails.

An advantage I didn't think of: when management tools handle
negotiation, it happens out-of-band and the migration protocol doesn't
need to be changed.

It seems like the migration protocol needs an overhaul sooner or later
anyway, so perhaps it's not worth making the negotiation external.

Stefan
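
For illustration only, the mask negotiation described in the quoted
proposal (query both sides, intersect, set the result on both) could be
sketched roughly as follows.  The capability bits and helper names are
hypothetical and are not QEMU or QMP APIs; a side that lacks the query
command is modelled as None and treated as a mask of zero.

```python
from typing import Optional

# Hypothetical capability bits, invented for this sketch only.
CAP_STRUCT_SENTINEL = 1 << 0   # sentinel after embedded structs
CAP_NEW_WIRE_FORMAT = 1 << 1   # placeholder for some future encoding


def effective_mask(reported: Optional[int]) -> int:
    """Treat a side without the query command (None) as mask zero."""
    return 0 if reported is None else reported


def negotiate(source: Optional[int], dest: Optional[int]) -> int:
    """Intersection of both masks; the value to set on both sides."""
    return effective_mask(source) & effective_mask(dest)


def missing_on_peer(mine: Optional[int], peer: Optional[int]) -> int:
    """Bits one side offers that the peer lacks, for error reporting."""
    return effective_mask(mine) & ~effective_mask(peer)


if __name__ == "__main__":
    src = CAP_STRUCT_SENTINEL | CAP_NEW_WIRE_FORMAT
    dst = CAP_STRUCT_SENTINEL
    print(bin(negotiate(src, dst)))        # common capabilities
    print(bin(missing_on_peer(src, dst)))  # what the destination lacks
    print(bin(negotiate(src, None)))       # old peer: falls back to zero
```

The same intersection works whether a management tool runs it
out-of-band or the two QEMU processes run it in an initial in-band
phase; only who calls negotiate() changes.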