qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: David Gibson <dwg@au1.ibm.com>
To: Juan Quintela <quintela@redhat.com>
Cc: Orit Wasserman <owasserm@redhat.com>,
	aliguori@us.ibm.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] Fix off-by-1 error in RAM migration code
Date: Sun, 4 Nov 2012 02:00:54 +1100	[thread overview]
Message-ID: <20121103150054.GT27695@truffula.fritz.box> (raw)
In-Reply-To: <87625oi84n.fsf@elfo.mitica>

On Fri, Nov 02, 2012 at 11:58:32AM +0100, Juan Quintela wrote:
> David Gibson <dwg@au1.ibm.com> wrote:
> > On Wed, Oct 31, 2012 at 01:08:16PM +0200, Orit Wasserman wrote:
> >> On 10/31/2012 05:43 AM, David Gibson wrote:
> >> > The code for migrating (or savevm-ing) memory pages starts off by creating
> >> > a dirty bitmap and filling it with 1s.  Except, actually, because bit
> >> > addresses are 0-based it fills every bit except bit 0 with 1s and puts an
> >> > extra 1 beyond the end of the bitmap, potentially corrupting unrelated
> >> > memory.  Oops.  This patch fixes it.
> >> > 
> >> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> >> > ---
> >> >  arch_init.c |    2 +-
> >> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >> > 
> >> > diff --git a/arch_init.c b/arch_init.c
> >> > index e6effe8..b75a4c5 100644
> >> > --- a/arch_init.c
> >> > +++ b/arch_init.c
> >> > @@ -568,7 +568,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
> >> >      int64_t ram_pages = last_ram_offset() >> TARGET_PAGE_BITS;
> >> >  
> >> >      migration_bitmap = bitmap_new(ram_pages);
> >> > -    bitmap_set(migration_bitmap, 1, ram_pages);
> >> > +    bitmap_set(migration_bitmap, 0, ram_pages);
> >> >      migration_dirty_pages = ram_pages;
> >> >  
> >> >      bytes_transferred = 0;
> >> > 
> >> You are correct, good catch.
> >> Reviewed-by: Orit Wasserman <owasserm@redhat.com>
> >
> > Juan,
> >
> > Sorry, forgot to CC you on the original mailing here, which I should
> > have done.  This is a serious bug in the migration code and we should
> > apply to mainline ASAP.
> 
> Reviewed-by: Juan Quintela <quintela@redhat.com> 
> 
> Good catch, I missunderstood the function when fixing a different bug,
> and never undrestood why it fixed it.

Actually.. it just occurred to me that I think there has to be another
bug here somewhere..

I haven't actually observed any effects from the memory corruption -
though it's certainly a real bug.  I found this because another effect
of this bug is that migration_dirty_pages count was set to 1 more than
the actual number of dirty bits in the bitmap.  That meant the dirty
pages count was never reaching zero and so the migration/savevm never
terminated.

Except.. that every so often the migration *did* terminate (maybe 1
time in 5).  Also I kind of hope somebody would have noticed this
earlier if migrations never terminated on x86 too.  But as far as I
can tell, if initially mismatched like this it ought to be impossible
for the dirty page count to ever reach zero.  Which suggests there is
another bug with the dirty count tracking :(.

It's possible the memory corruption could account for this, of course
- since that in theory at least, could have almost any strange effect
on the program's behavior.  But that doesn't seem particularly likely
to me.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

  reply	other threads:[~2012-11-03 14:59 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-10-31  3:43 [Qemu-devel] [PATCH] Fix off-by-1 error in RAM migration code David Gibson
2012-10-31 11:08 ` Orit Wasserman
2012-11-02  3:15   ` David Gibson
2012-11-02 10:58     ` Juan Quintela
2012-11-03 15:00       ` David Gibson [this message]
2012-11-04 19:17         ` Juan Quintela
2012-11-05  0:17           ` David Gibson
2012-11-21  0:43       ` David Gibson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121103150054.GT27695@truffula.fritz.box \
    --to=dwg@au1.ibm.com \
    --cc=aliguori@us.ibm.com \
    --cc=owasserm@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).