From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([140.186.70.92]:50535)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1RNnfR-0007M3-QM for qemu-devel@nongnu.org;
	Tue, 08 Nov 2011 10:32:57 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1RNnfQ-0001oR-0B for qemu-devel@nongnu.org;
	Tue, 08 Nov 2011 10:32:53 -0500
Received: from mail-yw0-f45.google.com ([209.85.213.45]:50619)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1RNnfP-0001oH-SU for qemu-devel@nongnu.org;
	Tue, 08 Nov 2011 10:32:51 -0500
Received: by ywb3 with SMTP id 3so727649ywb.4 for ;
	Tue, 08 Nov 2011 07:32:50 -0800 (PST)
Message-ID: <4EB94B9F.5040102@codemonkey.ws>
Date: Tue, 08 Nov 2011 09:32:47 -0600
From: Anthony Liguori
MIME-Version: 1.0
References: <1319540983-4248-1-git-send-email-benoit.canet@gmail.com>
	<1319540983-4248-5-git-send-email-benoit.canet@gmail.com>
	<4EB8CD52.1000008@redhat.com> <4EB91EDE.7070909@redhat.com>
	<4EB922DD.6040309@redhat.com> <4EB933BE.7080503@codemonkey.ws>
	<4EB93ED9.6060105@redhat.com> <4EB944EE.9090304@codemonkey.ws>
	<4EB9477D.5010804@redhat.com>
In-Reply-To: <4EB9477D.5010804@redhat.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 4/5] integratorcp: convert integratorcm to VMState
To: Avi Kivity
Cc: Peter Maydell, Benoît Canet, qemu-devel@nongnu.org, quintela@redhat.com

On 11/08/2011 09:15 AM, Avi Kivity wrote:
> On 11/08/2011 05:04 PM, Anthony Liguori wrote:
>>> What state is that?  Some devices have fixed size, offset, parent, and
>>> enable/disable state (is there a word for that?), so there is no state
>>> that needs to be transferred.  For other devices this is all dynamic.
>>
>> Any mutable state should be save/restored.
>> Immutable state doesn't need to be saved as it's created as part of the
>> device model.
>
> The memory API doesn't know which fields are mutable and which are not.

Right, but sending immutable fields is just wasteful; it's not
functionally incorrect.

>>
>> If the question is, how do we restore the immutable state, that should
>> be happening as part of device creation, no?
>>
>>> The way I see it, we create a link between some device state (a
>>> register) and a memory API field (like the offset).  This way, when one
>>> changes, so does the other.  In complicated devices we'll have to write
>>> a callback.
>>
>> In devices where we dynamically change the offset (it's mutable), we
>> should save the offset and restore it.  Since offset is sometimes
>> mutable and sometimes immutable, we should always save/restore it.  In
>> the cases where it's really immutable, since the value isn't changing,
>> there's no harm in doing save/restore.
>
> There is, you're taking an implementation detail and making it into an
> ABI.  Change the implementation and migration breaks.

Yes, that's a feature, not a bug.

If we send too little state today in version X, then discover this while
working on version X + 1, we have no recourse.  We have to blacklist
version X.

Discovering this is hard because we have to find a symptom of broken
migration.  This is often subtle like, "if you migrate while a floppy
request is in flight, the request is lost resulting in a timeout in the
guest kernel".

If we send too much state (internal implementation that is derived from
something else) in version X, then discover this while working on version
X + 1, we can filter the incoming state in X + 1 to just ignore the extra
state and derive the correct internal state from the other stable
registers.

Discovering cases like this is easy because migration fails directly--not
indirectly through a functional regression.  That means this is something
we can very easily catch in regression testing.
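To make the "send too much, filter on the receiving end" case concrete,
here is a rough sketch in Python.  It is a toy model, not QEMU's VMState
code: the dict-based stream, the function names, and the rule that the
offset is the register value times 0x1000 are all assumptions made up
for illustration.

```python
# Toy model only; every name below is hypothetical, not a QEMU API.

def derive_offset(reg):
    # Assumed rule tying the device register to the memory API offset.
    return reg * 0x1000


def save_v1(dev):
    # Version X: save everything, including 'offset', which is really
    # derived from 'reg'.
    return {"version": 1, "reg": dev["reg"], "offset": dev["offset"]}


def save_v2(dev):
    # Version X + 1: the derived field is no longer sent.
    return {"version": 2, "reg": dev["reg"]}


def load_v2(stream):
    # Version X + 1 loader: accepts streams from both versions.
    if stream["version"] not in (1, 2):
        raise ValueError("unsupported stream version")
    reg = stream["reg"]
    # Any 'offset' sent by a version X source is ignored here; the
    # internal state is rebuilt from the stable register instead.
    return {"reg": reg, "offset": derive_offset(reg)}
```

With this shape, load_v2(save_v1(dev)) and load_v2(save_v2(dev)) give the
same device state, so the extra field in the old stream costs bandwidth
but nothing else.  The opposite case has no such fix: if version X had
never sent 'reg', there would be nothing for the new loader to derive
from.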
I actually think this is the way to do it too.  Save/restore everything by
default and then as we develop and discover migration breaks, add
filtering in the new versions to ignore and not send internal state.

I don't think there's a tremendous amount of value in proactively
filtering internal state.  A lot of internal state never changes over a
long period of time.

>> Yes, we could save just the device register, and use a callback to
>> regenerate the offset.  But that adds complexity and leads to more
>> save/restore bugs.
>>
>> We shouldn't be reluctant to save/restore derived state.  Whether we
>> send it over the wire is a different story.  We should start by saving
>> as much state as we need to, and then sit down and start removing
>> state and adding callbacks as we need to.
>
> "saving state without sending it over the wire" is another way of saying
> "not saving state".

Or filtering it on the receiving end.  That's the fundamental difference.

>> Why?  The only thing that removing it does is create additional
>> complexity for save/restore.  You may argue that sending minimal state
>> improves migration compatibility but I think the current state of
>> save/restore is an existence proof that this line of reasoning is
>> incorrect.
>
> It doesn't create additional complexity for save restore, and I don't
> think that the current state of save/restore proves anything except that
> it needs a lot more work.

It's very hard to do the style of save/restore that we do correctly.

Regards,

Anthony Liguori