From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:57381)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1ZCopV-0001cu-1m for qemu-devel@nongnu.org;
 Wed, 08 Jul 2015 08:52:02 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1ZCopR-00078f-RO for qemu-devel@nongnu.org;
 Wed, 08 Jul 2015 08:52:00 -0400
Received: from mx1.redhat.com ([209.132.183.28]:60244)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1ZCopR-00078W-KF for qemu-devel@nongnu.org;
 Wed, 08 Jul 2015 08:51:57 -0400
From: Juan Quintela
In-Reply-To: <559D18B9.4030800@de.ibm.com> (Christian Borntraeger's message
 of "Wed, 08 Jul 2015 14:34:01 +0200")
References: <1436274549-28826-1-git-send-email-quintela@redhat.com>
 <1436274549-28826-16-git-send-email-quintela@redhat.com>
 <559CF767.3060000@de.ibm.com> <87a8v6onqi.fsf@neno.neno>
 <559D14BD.6000809@de.ibm.com> <87615uomyv.fsf@neno.neno>
 <559D18B9.4030800@de.ibm.com>
Date: Wed, 08 Jul 2015 14:51:55 +0200
Message-ID: <871tgiolqs.fsf@neno.neno>
MIME-Version: 1.0
Content-Type: text/plain
Subject: Re: [Qemu-devel] [PULL 15/28] migration: create new section to store global state
Reply-To: quintela@redhat.com
To: Christian Borntraeger
Cc: amit.shah@redhat.com, Cornelia Huck, qemu-devel@nongnu.org,
 "Dr. David Alan Gilbert"

Christian Borntraeger wrote:
> On 08.07.2015 at 14:25, Juan Quintela wrote:
>> Christian Borntraeger wrote:
>>> On 08.07.2015 at 14:08, Juan Quintela wrote:
>>>> Christian Borntraeger wrote:
>>>>> On 07.07.2015 at 15:08, Juan Quintela wrote:
>>>>>> This includes a new section that for now just stores the current qemu state.
>>>>>>
>>>>>> Right now, there are only one way to control what is the state of the
>>>>>> target after migration.
>>>>>>
>>>>>> - If you run the target qemu with -S, it would start stopped.
>>>>>> - If you run the target qemu without -S, it would run just after
>>>>>>   migration finishes.
>>>>>>
>>>>>> The problem here is what happens if we start the target without -S and
>>>>>> there happens one error during migration that puts current state as
>>>>>> -EIO. Migration would ends (notice that the error happend doing block
>>>>>> IO, network IO, i.e. nothing related with migration), and when
>>>>>> migration finish, we would just "continue" running on destination,
>>>>>> probably hanging the guest/corruption data, whatever.
>>>>>>
>>>>>> Signed-off-by: Juan Quintela
>>>>>> Reviewed-by: Dr. David Alan Gilbert
>>>>>
>>>>> This is bisected to cause a regression on s390.
>>>>>
>>>>> A guest restarts (booting) after managedsave/start instead of continuing.
>>>>>
>>>>> Do you have any idea what might be wrong?
>>>>
>>>> Can you check the new patch series that I sent. There is a fix that
>>>> *could* help there. *could* because I don't fully understand why it can
>>>> give you problems (and even only sometimes). Current guess is that some
>>>> of the devices are testing the guest state on LOAD, so that patch.
>>>>
>>>> Please, test.
>>>
>>> That patch does indeed fix my problem.
>>> I can see that virtio_init uses the runstate to set vm_running of the vdev. This
>>> is used in virtio-net for several aspects.
>>> But I really dont understand why this causes the symptoms.
>>> So I am tempted to add
>>> a
>>> Tested-by: Christian Borntraeger
>>>
>>> but I have a bad feeling on the "why" :-/
>>
>> The other reason of that patch is that it removes the need that the
>> global_state is the last migration section.
>
> Hmm, we have some register_savevm here and there, but these seem
> to be called in device realize/init or machine init.
>
>
>> Could it be that you are
>> adding sections after you launch qemu? Or that virtio devices are
>> generated later on s390 for any reason?
>
> Not that I am aware of.
> There can be hotplug of course, but I dont see
> why libvirt should do that during migration.

It can also happen if the devices are plugged after we have registered the
global save state. On x86_64 that doesn't happen, but I'm not sure how things
are structured on s390. If that was the cause, it has been fixed by the patch
that I sent.

>
> Still looking.....

Thanks, Juan.
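
As a rough illustration of the ordering concern above (a device that checks
the run state while it is being loaded, combined with the global_state
section not being the last one), here is a toy Python model. It is only a
sketch under assumptions: load_stream, the section tuples and the dictionary
keys are made up for illustration and do not correspond to QEMU's real code.
Deferring the run state until all sections are loaded is assumed here as one
way to remove the ordering requirement; it is not taken from the patch.

```python
# Toy model only: every name here is hypothetical and none of it is QEMU code.

def load_stream(sections, defer_global_state):
    """Replay migration sections in order and record what a device observes."""
    # The destination starts in an "incoming migration" state.
    target = {"runstate": "inmigrate", "seen_by_device": None}
    pending = None
    for kind, payload in sections:
        if kind == "global_state":
            if defer_global_state:
                # Remember the saved run state, apply it after the loop.
                pending = payload
            else:
                # Apply the saved run state as soon as the section is read.
                target["runstate"] = payload
        elif kind == "device":
            # The device samples the run state while it is being loaded.
            target["seen_by_device"] = target["runstate"]
    if pending is not None:
        target["runstate"] = pending
    return target


GLOBAL = ("global_state", "running")
DEVICE = ("device", None)

for order_name, sections in (("global_state last", [DEVICE, GLOBAL]),
                             ("global_state first", [GLOBAL, DEVICE])):
    for defer in (False, True):
        print(order_name, "defer=%s" % defer, load_stream(sections, defer))
```

In this model, applying the state immediately makes what the device sees
depend on the section order, while deferring it gives the same result in
both orders.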