Date: Thu, 29 Mar 2018 12:11:00 +0100
From: "Dr. David Alan Gilbert"
Message-ID: <20180329111059.GD2982@work-vm>
References: <1521530809-11780-1-git-send-email-zhaoshenglong@huawei.com> <1521530809-11780-3-git-send-email-zhaoshenglong@huawei.com> <5AB0F254.3050503@huawei.com> <5AB21130.2020309@huawei.com>
Subject: Re: [Qemu-devel] [Qemu-arm] [PATCH v2 2/2] arm_gicv3_kvm: kvm_dist_get/put: skip the registers banked by GICR
To: Peter Maydell
Cc: Shannon Zhao, qemu-arm, QEMU Developers, Eric Auger, Juan Quintela

* Peter Maydell (peter.maydell@linaro.org) wrote:
> On 23 March 2018 at 12:08, Peter Maydell wrote:
> > On 21 March 2018 at 08:00, Shannon Zhao wrote:
> >> On 2018/3/20 19:54, Peter Maydell wrote:
> >>> Can you still successfully migrate a VM from a QEMU version
> >>> without this bugfix to one with the bugfix?
> >>>
> >> I've tested this case. I can migrate a VM between these two versions.
> >
> > Hmm. Looking at the code I can't see how that would work,
> > except by accident. Let me see if I understand what's happening
> > here:
> >
> > In the code in master, we have QEMU data structures
> > (bitmaps, etc) which have one entry for each of GICV3_MAXIRQ
> > irqs. That includes the RAZ/WI unused space for the SGIs/PPIs, so
> > for a 1-bit-per-irq bitmap:
> >  [0x00000000, irq 32, irq 33, .... ]
> >
> > When we fill in the values from KVM into these data structures,
> > we start after the unused space, because the for_each_dist_irq_reg()
> > macro starts with _irq = GIC_INTERNAL. But we forgot to adjust
> > the offset value we use for the KVM access, so we start by
> > reading the RAZ/WI values from KVM, and the data structure
> > contents end up with:
> >  [0x00000000, 0x00000000, irq 32, irq 33, ... ]
> > (and the last irqs wouldn't get transferred).
> >
> > With this change to the code we will get the offset right and
> > the data structure will be filled as
> >  [0x00000000, irq 32, irq 33, .... ]
> >
> > But for migration from the old version, the data structure
> > we receive from the migration source will contain the old
> > broken layout of
> >  [0x00000000, 0x00000000, irq 32, irq 33, ... ]
> > so if the new code doesn't do anything special to handle
> > migration from that old version then it will write zeroes to
> > irq 32..63, and then write incorrect values for all the irqs
> > after that, won't it?
> >
> > That suggests to me that we need to have some code in the
> > migration post-load routine that identifies that the data
> > is coming from an old version with this bug, and shifts
> > all the data down in the arrays so that the code to write
> > it to the kernel can handle it.
>
> I was thinking a bit more about how to handle this, and
> my best idea was:
>
> (1) send something in the migration stream that says
> "I don't have this bug" (version number change?
> vmstate field that's just a "no bug" flag? subsection
> with no contents?)
>
> (2) on the destination, if the source doesn't tell us
> it doesn't have this bug, and we are running KVM, then
> shift all the data in the arrays down to fix it up
> [Strictly what we want to know is if the source is
> running KVM, not if the destination is, but I don't
> know of a way to find that out, and in practice TCG->KVM
> migrations don't work anyway, so it's not a big deal.]
>
> Juan, David, do you have any suggestions for the best
> mechanism for part 1; or is there some clever way to
> handle this sort of bug that I've missed?

The subsection is probably the best bet; unless, that is, you can
find a bit to misuse in an existing field.

Dave

> thanks
> -- PMM
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
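
For illustration only, here is a minimal standalone sketch of the step (2)
fixup described in the quoted text, for a single 1-bit-per-irq bitmap stored
as 32-bit words. It is not QEMU code: fixup_shifted_bitmap(),
post_load_fixup() and the src_has_fix flag are hypothetical names, with
src_has_fix standing in for whatever "I don't have this bug" marker (for
example an empty subsection) the stream ends up carrying. It assumes the
buggy source's layout is exactly the one shown above, i.e. the data sits one
RAZ/WI-sized block too high and the final block was never transferred.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GIC_INTERNAL 32  /* irqs 0..31 (SGIs/PPIs) are RAZ/WI in the distributor */

/*
 * Hypothetical helper: undo the off-by-one-block layout produced by a
 * buggy source. Word 0 covers irqs 0..31 and stays untouched; everything
 * from word 2 onwards moves down by one block, and the tail, which the
 * buggy source never filled in, is cleared.
 */
static void fixup_shifted_bitmap(uint32_t *bmp, size_t nwords)
{
    const size_t skip = GIC_INTERNAL / 32;  /* words of RAZ/WI space (1) */

    assert(bmp != NULL);
    if (nwords <= 2 * skip) {
        return;  /* nothing beyond the RAZ/WI words to move */
    }
    /* [0, 0, irq32.., irq64.., ...] -> [0, irq32.., irq64.., ...] */
    memmove(&bmp[skip], &bmp[2 * skip],
            (nwords - 2 * skip) * sizeof(bmp[0]));
    /* The last block was lost on the source side; zero it. */
    memset(&bmp[nwords - skip], 0, skip * sizeof(bmp[0]));
}

/*
 * Hypothetical post-load hook: only shift when the source did not
 * announce the fix and we are running with KVM, as suggested in (2).
 */
static void post_load_fixup(uint32_t *bmp, size_t nwords,
                            bool src_has_fix, bool using_kvm)
{
    if (!src_has_fix && using_kvm) {
        fixup_shifted_bitmap(bmp, nwords);
    }
}
```

A real fix would have to apply the same shift to every per-irq array in the
distributor state, scaling the shift to each field's width (for instance
8 bits per irq for priorities), which this sketch does not attempt.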