From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:45787) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Wkufr-0006tz-Bu for qemu-devel@nongnu.org; Thu, 15 May 2014 08:22:17 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Wkufl-0000xi-4X for qemu-devel@nongnu.org; Thu, 15 May 2014 08:22:11 -0400 Received: from mx1.redhat.com ([209.132.183.28]:63200) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Wkufk-0000xS-Sk for qemu-devel@nongnu.org; Thu, 15 May 2014 08:22:05 -0400 Date: Thu, 15 May 2014 15:20:52 +0300 From: "Michael S. Tsirkin" Message-ID: <20140515122052.GA23018@redhat.com> References: <20140515062351.GB14456@redhat.com> <20140515064635.GB31192@grmbl.mre> <20140515090449.2db0cbe0@bahia.local> <537486D2.1060609@suse.de> <20140515095208.GB16405@redhat.com> <53748FC1.5010702@suse.de> <20140515100309.GA22407@redhat.com> <537492C0.9030100@suse.de> <20140515101651.GB22512@redhat.com> <5374AC4F.4000507@suse.de> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <5374AC4F.4000507@suse.de> Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Andreas =?iso-8859-1?Q?F=E4rber?= Cc: Kevin Wolf , Fam Zheng , Stefan Hajnoczi , Juan Quintela , Alexander Graf , qemu-devel@nongnu.org, Anthony Liguori , Amit Shah , Paolo Bonzini , Greg Kurz On Thu, May 15, 2014 at 02:00:15PM +0200, Andreas F=E4rber wrote: > Am 15.05.2014 12:16, schrieb Michael S. Tsirkin: > > On Thu, May 15, 2014 at 12:11:12PM +0200, Andreas F=E4rber wrote: > >> Am 15.05.2014 12:03, schrieb Michael S. Tsirkin: > >>> On Thu, May 15, 2014 at 11:58:25AM +0200, Andreas F=E4rber wrote: > >>>> Am 15.05.2014 11:52, schrieb Michael S. Tsirkin: > >>>>> On Thu, May 15, 2014 at 11:20:18AM +0200, Andreas F=E4rber wrote: > >>>>>> Am 15.05.2014 09:04, schrieb Greg Kurz: > >>>>>>> On Thu, 15 May 2014 12:16:35 +0530 > >>>>>>> Amit Shah wrote: > >>>>>>>> On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote: > >>>>>>>>> On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote: > >>>>>>>>>> On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote: > >>>>>>>>>>> Since each virtio device is streamed in its own section, th= e idea is to > >>>>>>>>>>> stream subsections between the end of the device section an= d the start > >>>>>>>>>>> of the next sections. This allows an older QEMU to complain= and exit > >>>>>>>>>>> when fed with subsections: > >>>>>>>>>>> > >>>>>>>>>>> Unknown savevm section type 5 > >>>>>>>>>>> Error -22 while loading VM state > >>>>>>>>>> > >>>>>>>>>> Please make this configurable -- either via configure or dev= ice > >>>>>>>>>> properties. That avoids having to break existing configurat= ions that > >>>>>>>>>> work without this patch. > >>>>>> > >>>>>> Since backwards migration is not supported upstream, wouldn't it= be > >>>>>> easiest to just add support for the subsection marker and skippi= ng to > >>>>>> the end of section in that downstream? > >>>>> > >>>>> Backwards and forwards migration need to be supported, > >>>>> customers told us repeatedly. So some downstreams support this > >>>>> and not supporting it upstream just means downstreams need > >>>>> to do their own thing. > >>>>> > >>>>> As importantly, ping-pong migration is the only > >>>>> reliable way to stress migration. > >>>>> > >>>>> So if we want to test cross-version we need it to work > >>>>> both way. > >>>>> > >>>>> Finally, the real issue and difficulty with cross-version migrati= on is > >>>>> making VM behave in a backwards compatible way. Serializing in a > >>>>> compatible way is a trivial problem, or would be if the code wasn= 't a > >>>>> mess :) Once you do the hard part, breaking migration because of = the > >>>>> trivial serialization issue is just silly. And special-casing fo= rward > >>>>> migration does not make code simpler, it really only leads to > >>>>> proliferation of hacks and lack of symmetry. > >>>>> > >>>>> So yes it's a useful feature, and no not supporting it does > >>>>> not help anyway. > >>>> > >>>> It seems you misunderstand. I was not saying it's not useful. > >>>> > >>>> My point is that VMStateSubsections added in newer versions (and t= hus > >>>> not existing in older versions) need to be handled for any > >>>> VMState-converted devices. So why can't we make that work for virt= io too? > >>> > >>> Sure. > >>> AFAIK the way to this works is by adding an "field_exists" callback= , > >>> and not sending the section when we are in a compat mode. > >> > >> OK, but upstream always sends the latest version today. > >> So isn't that > >> just two ifs that you would need to add in save and load functions a= dded > >> here for the downstream? x86_64 is unaffected from ppc's endianness > >> changes and you still do ppc64 BE, so there's no real functional pro= blem > >> for RHEL, is there? > >=20 > > I'm sorry I don't understand what you are saying here. > > Simply put, if you specify a compatible -M then your > > device should behave, and migrate, exactly like and old > > qemu did. >=20 > Whatever the version_id fields are for, upstream QEMU *always* saves th= e > newest version_id format that it knows. There is no mechanism that I'm > aware of in upstream QEMU for migrating device fields dependent on -M. > So a device in QEMU only migrates exactly like an old QEMU did if > neither fields nor subsections were added. Ah you are talking about version ids. Yes, they can not support cross-version migration. For this reason these should only be incremented as the last resort, normally new fields (possibly subsections) should be added, with field_exists callback checking for compatibility mode. > Subsections are usually migrated based on whether the state differs fro= m > the default state (didn't check whether the final patch does this > correctly here, but doesn't matter for 1/8 concept). Yes, this would be fine, but it's not what's implemented IIUC. > So for x86 the > subsection should *never* get written and thus not be a problem for you. > For ppc64 it should not get written either, unless you care about > migration from SLES12 or upstream ppc64le to RHEL ppc64. > As I understood the IRC discussion Alex and me had with Greg about this= , > this is copying the exact code VMState uses to write its subsections, s= o > there would be no forward migration problems at all once we convert to > VMState and VMStateSubsections. The only question remaining is how does > your downstream react when it encounters a subsection marker for > subsections it doesn't know - which is not something upstream can solve > for you unless you contribute backwards migration support upstream. >=20 > Don't forget that Greg is introducing subsection support *because of* > your backwards migration wish, so telling him not to add subsections > because of backwards migration is really weird! I don't really care whether it's a subsection or just plain qemu_put_byte, but I'm not saying not to use subsection either. I am merely saying this should have "exists" callback checking for compability mode for old machine types. > Either subsections work > with your downstream or they don't; if they don't, then you have much > larger problems than can be solved by blocking this one section for ppc. > Therefore I was saying if your downstream forgot to handle the case of = a > non-VMState device having subsections, then it may be easier to fix > exactly that than make Greg jump through more hoops here for a > theoretical problem you are unlikely to encounter on your downstream. >=20 > Andreas I only mentioned downstream because it's a requirement coming from users most of whom never interact with upstream. There shouldn't be any difference and people building clusters using upstream qemu will have same needs: clusters can't be completely brought down for version upgrades, so need to be able to run a mix of versions, and be able to migrate in any direction while in the process. > --=20 > SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 N=FCrnberg, Germany > GF: Jeff Hawn, Jennifer Guild, Felix Imend=F6rffer; HRB 16746 AG N=FCrn= berg