Message-ID: <4E2EB522.2000808@codemonkey.ws>
Date: Tue, 26 Jul 2011 07:37:54 -0500
From: Anthony Liguori
References: <1309448777-1447-1-git-send-email-pbonzini@redhat.com> <4E2DFAE5.1050304@codemonkey.ws>
Subject: Re: [Qemu-devel] [RFC PATCH 0/4] Fix subsection ambiguity in the migration format
To: quintela@redhat.com
Cc: Ryan Harper, Paolo Bonzini, qemu-devel@nongnu.org, Stefan Hajnoczi, mst@redhat.com

On 07/26/2011 07:07 AM, Juan Quintela wrote:
> Anthony Liguori wrote:
>> == What we need ==
>>
>> We need to decompose migration into three different problems: 1)
>> serializing device state 2) transforming the device model in order to
>> satisfy forwards and backwards compatibility 3) encoding the
>> serialized device model on the wire.
>
> I will change this to:
> - We need to be able to "enable/disable" features of a device.
>   A.K.A. make -M pc-0.14 work with devices with the same features
>   as 0.14. Notice that this is _independent_ of migration.

In theory, we already have this with qdev flags.

> - Be able to describe the different features/versions.
>   This is not the
>   difficult part, it can be subsections, optional fields, whatever.
>   The difficult part is _knowing_ which fields need to be in
>   each version. That again depends on the device, not migration.
>
> - Be able to do forward/backward compatibility (and without
>   communication between both sides this is basically impossible).

Hrm, I'm not sure I agree with these conclusions.

Management tools should do their best job to create two compatible
device models. Given two compatible device models, there *may* be
differences in the structure of the device models since we evolve things
over time. We may rename a field, change the type, etc.

To support this, we can use filters on both the source and the
destination end to do our best to massage the device model into
something compatible.

But creating two compatible device models is not the job of the
migration protocol. It's the job of management tools.

> - Send things on the wire (really this is the easy part, we can play
>   with it touching only migration functions.).
>
>> We also need a way to future proof ourselves.
>
> We have been very bad at this. Automatic checking is the only way that
> I can think of.

I don't know what you mean by automatic checking.

>> == What we can do ==
>>
>> 1) Add migration capabilities to future proof ourselves. I think the
>> simplest way this would work is to have a
>> 'query-migration-capabilities' command that returned a bitmask of
>> supported migration features. I think we also introduce a
>> 'set-migration-capabilities' command that can mask some of the
>> supported features.
>
> We have two things here. Device level & protocol level.
>
> Device level: very late to set anything.
> Protocol level: we can set things here, but notice that only a few
> things can be set here.

Once we have a protocol level feature bit, we can add device level
feature bits as a new feature.
>> A management tool would query migration features on the source and
>> destination, take the intersection of the two masks, and set that mask
>> on both the source and destination.
>>
>> Lack of support for these commands indicates a mask of zero which is
>> the protocol we offer today.
>>
>> 2) Switch to a visitor model to serialize device state. This involves
>> converting any occurrence of:
>>
>> qemu_put_be32(f, port->guest_connected);
>>
>> To:
>>
>> visit_type_u32(v, "guest_connected", &port->guest_connected, &local_err);
>
> VMSTATE_INT32(guest_conected, FooState)
>
> can be made to do this at any point.
>
>> It's 100% mechanical and makes absolutely no logic change. It works
>> equally well with legacy and VMstate migration handlers.
>>
>> 3) Add a Visitor class that operates on QEMUFile.
>>
>> At this stage, we can migrate to data structures. That means we can
>> migrate to QEMUFile, QObjects, or JSON. We could change the protocol
>> at this stage to something that was still binary but had section sizes
>> and things of that nature.
>
> That was the whole point of vmstate.

The problem with vmstate is that it's an all or nothing thing and the
conversion isn't programmatic. Since visiting and qemu_put match 1-1,
we can do the conversion all-at-once with some sed magic.

>> So if we did this in 1.0, we could have a single function that
>> converted the 1.0 device model to 1.1 and vice versa, and we'd be
>> fine. We wouldn't have to touch 200 devices to do this.
>
> I still think this is wrong. We are launching a device with feature
> "foo", and at migration time, we want to migrate without feature
> "foo". This is not going to work in the general case. But launching
> the device _without_ feature "foo" will always work.

Don't confuse migration with creating compatible device models. We're
never going to support migrating from a system with an e1000 to a system
with virtio :-)

> Notice the things that "can" be optional:
> - features that are not used.
>   We update the device to have more
>   features, but OS driver only uses the features of the old version.
>   With a subsection test, we can fix this one.
>
> - values that are only needed sometimes. PIO subsection comes to mind,
>   it is only needed when we are in the middle of a PIO operation.
>
> - values that rarely change from defaults. This is the mmio addresses
>   problem with rtl8139. If we plug/unplug the card, we will get a
>   different address, so we need to change it.
>
> - values that depend on other features (change default size of memory,
>   add new variables, etc). This is by its very nature not compatible,
>   and we can't migrate.
>
> What am I complaining about here? This "compatibility" support supposes
> that migration works as:
>
> device with some features -> migration -> device with other features
>
> and it works. This means that "migration" does magic, and this is never
> going to work.
>
> Until now, this kind of worked because we only supported migration from
> old -> new, or the same version. Migration from old -> new can never
> have new features. But for new -> old to work, we need a way to
> disable the new features. That is completely independent of migration.

At startup time, not dynamically. And we have this, that's what -M pc-X
is about.

>
>> 5) Once we're here, we can implement the next 5-year format. That
>> could be ASN.1 and be bidirectional or whatever makes the most sense.
>> We could support 50 formats if we wanted to. As long as the transport
>> is distinct from the serialization and compat routines, it really
>> doesn't matter.
>
> This means finishing the VMState support, once there, the only thing
> that needs to change is "copy" the savevm, and change the "visitors" to
> whatever else that we need/want.

There's no need to "finish" VMState to convert to visitors. It's just
sed -e 's:qemu_put_be32:visit_type_int32:g'

Regards,

Anthony Liguori

>
> Later, Juan.