All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Juan Quintela <quintela@redhat.com>
Cc: qemu-devel@nongnu.org, amit.shah@redhat.com, aarcange@redhat.com
Subject: Re: [Qemu-devel] [PATCH 02/15] postcopy: Transmit ram size summary word
Date: Wed, 25 Jan 2017 16:18:23 +0000	[thread overview]
Message-ID: <20170125161823.GL2096@work-vm> (raw)
In-Reply-To: <87wpdj79e9.fsf@emacs.mitica>

* Juan Quintela (quintela@redhat.com) wrote:
> "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> >
> > Replace the host page-size in the 'advise' command by a pagesize
> > summary bitmap; if the VM is just using normal RAM then
> > this will be exactly the same as before, however if they're using
> > huge pages they'll be different, and thus:
> >    a) Migration from/to old qemu's that don't understand huge pages
> >       will fail early.
> >    b) Migrations with different size RAMBlocks will also fail early.
> >
> > The previous block size check during RAM block transmission will also
> > catch these cases, but this catches a good summary and should be
> > clearer on a version mismatch.
> >
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> >  include/migration/migration.h |  1 +
> >  migration/ram.c               | 17 +++++++++++++++++
> >  migration/savevm.c            | 32 +++++++++++++++++++++-----------
> >  3 files changed, 39 insertions(+), 11 deletions(-)
> 
> 
> Hi
> 
> Why it is not enough with the previous patch where we compare than blocks
> have the same page size in both ends?

They catch it a bit late, especially since we add extra data to the migration
stream; the 'advise' happens nice and early so catches it immediately.

> I assume that issuing an madvise for block with the correct page_size
> would be enough, no?

Unfortunately not; you can't explicitly switch pagesize via madvise,
and unfortunately the hugetlbfs stuff doesn't do madvise DONTNEED, so you
have to use an fallocate to do that job for that block.

Dave

> Or I am missing something obvious?
> 
> Thanks, JUan.
> 
> >
> > diff --git a/include/migration/migration.h b/include/migration/migration.h
> > index c309d23..0188bcf 100644
> > --- a/include/migration/migration.h
> > +++ b/include/migration/migration.h
> > @@ -355,6 +355,7 @@ void global_state_store_running(void);
> >  void flush_page_queue(MigrationState *ms);
> >  int ram_save_queue_pages(MigrationState *ms, const char *rbname,
> >                           ram_addr_t start, ram_addr_t len);
> > +uint64_t ram_pagesize_summary(void);
> >  
> >  PostcopyState postcopy_state_get(void);
> >  /* Set the state and return the old state */
> > diff --git a/migration/ram.c b/migration/ram.c
> > index 39998f5..40fe888 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -608,6 +608,23 @@ static void migration_bitmap_sync_init(void)
> >      iterations_prev = 0;
> >  }
> >  
> > +/* Returns a summary bitmap of the page sizes of all RAMBlocks;
> > + * for VMs with just normal pages this is equivalent to the
> > + * host page size.  If it's got some huge pages then it's the OR
> > + * of all the different page sizes.
> > + */
> > +uint64_t ram_pagesize_summary(void)
> > +{
> > +    RAMBlock *block;
> > +    uint64_t summary = 0;
> > +
> > +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> > +        summary |= block->page_size;
> > +    }
> > +
> > +    return summary;
> > +}
> > +
> >  static void migration_bitmap_sync(void)
> >  {
> >      RAMBlock *block;
> > diff --git a/migration/savevm.c b/migration/savevm.c
> > index 0363372..aaae044 100644
> > --- a/migration/savevm.c
> > +++ b/migration/savevm.c
> > @@ -828,7 +828,7 @@ int qemu_savevm_send_packaged(QEMUFile *f, const uint8_t *buf, size_t len)
> >  void qemu_savevm_send_postcopy_advise(QEMUFile *f)
> >  {
> >      uint64_t tmp[2];
> > -    tmp[0] = cpu_to_be64(getpagesize());
> > +    tmp[0] = cpu_to_be64(ram_pagesize_summary());
> >      tmp[1] = cpu_to_be64(1ul << qemu_target_page_bits());
> >  
> >      trace_qemu_savevm_send_postcopy_advise();
> > @@ -1305,7 +1305,7 @@ static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
> >  static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis)
> >  {
> >      PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_ADVISE);
> > -    uint64_t remote_hps, remote_tps;
> > +    uint64_t remote_pagesize_summary, local_pagesize_summary, remote_tps;
> >  
> >      trace_loadvm_postcopy_handle_advise();
> >      if (ps != POSTCOPY_INCOMING_NONE) {
> > @@ -1317,17 +1317,27 @@ static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis)
> >          return -1;
> >      }
> >  
> > -    remote_hps = qemu_get_be64(mis->from_src_file);
> > -    if (remote_hps != getpagesize())  {
> > +    remote_pagesize_summary = qemu_get_be64(mis->from_src_file);
> > +    local_pagesize_summary = ram_pagesize_summary();
> > +
> > +    if (remote_pagesize_summary != local_pagesize_summary)  {
> >          /*
> > -         * Some combinations of mismatch are probably possible but it gets
> > -         * a bit more complicated.  In particular we need to place whole
> > -         * host pages on the dest at once, and we need to ensure that we
> > -         * handle dirtying to make sure we never end up sending part of
> > -         * a hostpage on it's own.
> > +         * This detects two potential causes of mismatch:
> > +         *   a) A mismatch in host page sizes
> > +         *      Some combinations of mismatch are probably possible but it gets
> > +         *      a bit more complicated.  In particular we need to place whole
> > +         *      host pages on the dest at once, and we need to ensure that we
> > +         *      handle dirtying to make sure we never end up sending part of
> > +         *      a hostpage on it's own.
> > +         *   b) The use of different huge page sizes on source/destination
> > +         *      a more fine grain test is performed during RAM block migration
> > +         *      but this test here causes a nice early clear failure, and
> > +         *      also fails when passed to an older qemu that doesn't
> > +         *      do huge pages.
> >           */
> > -        error_report("Postcopy needs matching host page sizes (s=%d d=%d)",
> > -                     (int)remote_hps, getpagesize());
> > +        error_report("Postcopy needs matching RAM page sizes (s=%" PRIx64
> > +                                                             " d=%" PRIx64 ")",
> > +                     remote_pagesize_summary, local_pagesize_summary);
> >          return -1;
> >      }
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

  reply	other threads:[~2017-01-25 16:18 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-06 18:28 [Qemu-devel] [PATCH 00/15] Postcopy: Hugepage support Dr. David Alan Gilbert (git)
2017-01-06 18:28 ` [Qemu-devel] [PATCH 01/15] postcopy: Transmit and compare individual page sizes Dr. David Alan Gilbert (git)
2017-01-25  8:01   ` [Qemu-devel] [01/15] " Alexey Perevalov
2017-01-25 18:38     ` Dr. David Alan Gilbert
2017-01-25  9:47   ` [Qemu-devel] [PATCH 01/15] " Juan Quintela
2017-01-25 16:15     ` Dr. David Alan Gilbert
2017-01-06 18:28 ` [Qemu-devel] [PATCH 02/15] postcopy: Transmit ram size summary word Dr. David Alan Gilbert (git)
2017-01-25  9:53   ` Juan Quintela
2017-01-25 16:18     ` Dr. David Alan Gilbert [this message]
2017-01-06 18:28 ` [Qemu-devel] [PATCH 03/15] postcopy: Chunk discards for hugepages Dr. David Alan Gilbert (git)
2017-01-25 10:44   ` Juan Quintela
2017-01-06 18:28 ` [Qemu-devel] [PATCH 04/15] Fold postcopy_ram_discard_range into ram_discard_range Dr. David Alan Gilbert (git)
2017-01-25 10:08   ` Juan Quintela
2017-01-25 16:43     ` Dr. David Alan Gilbert
2017-01-06 18:28 ` [Qemu-devel] [PATCH 05/15] postcopy: enhance ram_discard_range for hugepages Dr. David Alan Gilbert (git)
2017-01-25 10:14   ` Juan Quintela
2017-01-30 18:49     ` Dr. David Alan Gilbert
2017-01-30 19:19       ` Juan Quintela
2017-01-06 18:28 ` [Qemu-devel] [PATCH 06/15] postcopy: Record largest page size Dr. David Alan Gilbert (git)
2017-01-25 10:17   ` Juan Quintela
2017-01-30 16:36     ` Dr. David Alan Gilbert
2017-01-30 19:22       ` Juan Quintela
2017-01-06 18:28 ` [Qemu-devel] [PATCH 07/15] postcopy: Plumb pagesize down into place helpers Dr. David Alan Gilbert (git)
2017-01-25 10:25   ` Juan Quintela
2017-01-27 15:49     ` Dr. David Alan Gilbert
2017-01-06 18:28 ` [Qemu-devel] [PATCH 08/15] postcopy: Use temporary for placing zero huge pages Dr. David Alan Gilbert (git)
2017-01-25 10:29   ` Juan Quintela
2017-01-06 18:28 ` [Qemu-devel] [PATCH 09/15] postcopy: Load huge pages in one go Dr. David Alan Gilbert (git)
2017-01-25 10:31   ` Juan Quintela
2017-01-06 18:28 ` [Qemu-devel] [PATCH 10/15] postcopy: Mask fault addresses to huge page boundary Dr. David Alan Gilbert (git)
2017-01-31 13:20   ` Juan Quintela
2017-01-06 18:28 ` [Qemu-devel] [PATCH 11/15] postcopy: Send whole huge pages Dr. David Alan Gilbert (git)
2017-01-31 13:20   ` Juan Quintela
2017-01-06 18:28 ` [Qemu-devel] [PATCH 12/15] postcopy: Allow hugepages Dr. David Alan Gilbert (git)
2017-01-31 13:21   ` Juan Quintela
2017-01-06 18:28 ` [Qemu-devel] [PATCH 13/15] postcopy: Update userfaultfd.h header Dr. David Alan Gilbert (git)
2017-01-31 13:22   ` Juan Quintela
2017-01-06 18:28 ` [Qemu-devel] [PATCH 14/15] postcopy: Check for userfault+hugepage feature Dr. David Alan Gilbert (git)
2017-01-31 13:24   ` Juan Quintela
2017-01-31 16:20     ` Dr. David Alan Gilbert
2017-01-06 18:28 ` [Qemu-devel] [PATCH 15/15] postcopy: Add doc about hugepages and postcopy Dr. David Alan Gilbert (git)
2017-01-31 13:25   ` Juan Quintela
2017-01-06 18:51 ` [Qemu-devel] [PATCH 00/15] Postcopy: Hugepage support no-reply
2017-01-06 18:59   ` Dr. David Alan Gilbert
2017-01-09  0:55     ` Fam Zheng
2017-01-09  9:03       ` Dr. David Alan Gilbert
2017-01-06 19:02 ` no-reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170125161823.GL2096@work-vm \
    --to=dgilbert@redhat.com \
    --cc=aarcange@redhat.com \
    --cc=amit.shah@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.