From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([208.118.235.92]:59374) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1TVAtp-0003Uk-O2 for qemu-devel@nongnu.org; Sun, 04 Nov 2012 19:50:47 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1TVAto-0003lV-BD for qemu-devel@nongnu.org; Sun, 04 Nov 2012 19:50:45 -0500 Received: from e23smtp03.au.ibm.com ([202.81.31.145]:56033) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1TVAtn-0003kv-LU for qemu-devel@nongnu.org; Sun, 04 Nov 2012 19:50:44 -0500 Received: from /spool/local by e23smtp03.au.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Mon, 5 Nov 2012 10:47:29 +1000 Received: from d23av03.au.ibm.com (d23av03.au.ibm.com [9.190.234.97]) by d23relay04.au.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id qA50eEPP42991780 for ; Mon, 5 Nov 2012 11:40:15 +1100 Received: from d23av03.au.ibm.com (loopback [127.0.0.1]) by d23av03.au.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id qA50oPjd019119 for ; Mon, 5 Nov 2012 11:50:25 +1100 Date: Mon, 5 Nov 2012 11:17:00 +1100 From: David Gibson Message-ID: <20121105001700.GV27695@truffula.fritz.box> References: <1351654988-13165-1-git-send-email-david@gibson.dropbear.id.au> <509106A0.7020400@redhat.com> <20121102031508.GN27695@truffula.fritz.box> <87625oi84n.fsf@elfo.mitica> <20121103150054.GT27695@truffula.fritz.box> <87d2ztfa9i.fsf@elfo.mitica> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <87d2ztfa9i.fsf@elfo.mitica> Subject: Re: [Qemu-devel] [PATCH] Fix off-by-1 error in RAM migration code List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Juan Quintela Cc: Orit Wasserman , aliguori@us.ibm.com, qemu-devel@nongnu.org On Sun, Nov 04, 2012 at 08:17:29PM +0100, Juan Quintela wrote: > David Gibson wrote: > > On Fri, Nov 02, 2012 at 11:58:32AM +0100, Juan Quintela wrote: > >> David Gibson wrote: > >> > On Wed, Oct 31, 2012 at 01:08:16PM +0200, Orit Wasserman wrote: > >> >> On 10/31/2012 05:43 AM, David Gibson wrote: > >> > >> Reviewed-by: Juan Quintela > >> > >> Good catch, I missunderstood the function when fixing a different bug, > >> and never undrestood why it fixed it. > > > > Actually.. it just occurred to me that I think there has to be another > > bug here somewhere.. > > I am at KVM Forum and LinuxCon for this week, so I can't test anything. > > For some reason, I missunderstood bitmap_set() and though this was the > value that we "initiliazed" the bitmap with. So, I changed from 0 to 1, > and ..... I was sending half the pages over the wire. Yes, that is > right, just half of them. So clearly we have some bug somewhere > else :-() Ah. > > I haven't actually observed any effects from the memory corruption - > > though it's certainly a real bug. I found this because another effect > > of this bug is that migration_dirty_pages count was set to 1 more than > > the actual number of dirty bits in the bitmap. That meant the dirty > > pages count was never reaching zero and so the migration/savevm never > > terminated. > > I wonder what is on page 0 on an x86, problably some BIOS data that > never changes? No clue about pseries. On pseries it will be exception vectors. So it will change on the firmware->kernel transition. And I think there may be a very few special variables in there, so it will change later. Note that the migration_dirty_pages count will remain mismatched, whether or not page 0 gets touched. > > Except.. that every so often the migration *did* terminate (maybe 1 > > time in 5). Also I kind of hope somebody would have noticed this > > earlier if migrations never terminated on x86 too. But as far as I > > can tell, if initially mismatched like this it ought to be impossible > > for the dirty page count to ever reach zero. Which suggests there is > > another bug with the dirty count tracking :(. > > We use the dirty bitmap count to know how many pages are dirty, but once > that the number is low enough, we just sent "the rest" of the pages. > So, it would always converge (or not) independent of that bug by one. > We never test for zero dirty pages, we test "are we able to send this > many pages" over "max_downtime". So this explains why it works for you > sometimes. Hrm. It explains why it works sometimes (phew the bug I though must be there needn't be). It doesn't explain why it mostly didn't - I actually saw it spinning there on ram_save_iterate(), not completing with one dirty page remaining. But t > > It's possible the memory corruption could account for this, of course > > - since that in theory at least, could have almost any strange effect > > on the program's behavior. But that doesn't seem particularly likely > > to me. > > This depends on _what_ is on page zero, if that is differente for > whatever we put there during boot, and it we ever wrote to that page > again, we would mark that pge dirty anyways, so I would put that > "corruption" problem as highly improbable. Not that we shouldn't fix > the bug, but I doubt that you are getting memory corruption due to this > bug. > > The only way that you can get memory corruption is if you write to that > page just before you do migration, and then you never wrote to it again. > What is on hardware page zero on pseries? or it is just a normal page? You misunderstand. I wasn't talking about potential corruption of guest memory because we fail to transmit page 0 (though that is also a bug). I'm talking about corruption of qemu internal memory because the bitmap_set() writes one 1 bit beyond the end of the space allocated for the migration bitmap - it has no kind of bounds checking. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson