Re: [Qemu-devel] [PATCH v2] migration: discard non-migratable RAMBlocks

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: "Cédric Le Goater" <clg@kaod.org>
Cc: qemu-devel@nongnu.org, Juan Quintela <quintela@redhat.com>,
	David Gibson <david@gibson.dropbear.id.au>,
	Alex Williamson <alex.williamson@redhat.com>,
	Yulei Zhang <yulei.zhang@intel.com>,
	kevin.tian@intel.com, joonas.lahtinen@linux.intel.com,
	zhenyuw@linux.intel.com, kwankhede@nvidia.com,
	zhi.a.wang@intel.com, Peter Maydell <peter.maydell@linaro.org>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2] migration: discard non-migratable RAMBlocks
Date: Wed, 9 May 2018 12:08:34 +0100	[thread overview]
Message-ID: <20180509110833.GD2527@work-vm> (raw)
In-Reply-To: <e5447136-8de7-73f3-ca6d-0c7a1fe11abe@kaod.org>

This thread seems to have stalled; we've got the list of potential
migration breaks that Peter found and only some minor comments on the
actual patch.

I'd like to get it going again sicne as well as Cédric, and Peter's
cases, there's Lai Jiangshan and now Eric Wheeler was asking for
something similar.

Anyone understand the way forward? Do we have to go and fix the
odd board failures first?

Dave

* Cédric Le Goater (clg@kaod.org) wrote:
> Hello David,
> 
> On 04/19/2018 06:58 PM, Dr. David Alan Gilbert wrote:
> > * Cédric Le Goater (clg@kaod.org) wrote:
> >> On the POWER9 processor, the XIVE interrupt controller can control
> >> interrupt sources using MMIO to trigger events, to EOI or to turn off
> >> the sources. Priority management and interrupt acknowledgment is also
> >> controlled by MMIO in the presenter sub-engine.
> >>
> >> These MMIO regions are exposed to guests in QEMU with a set of 'ram
> >> device' memory mappings, similarly to VFIO, and the VMAs are populated
> >> dynamically with the appropriate pages using a fault handler.
> >>
> >> But, these regions are an issue for migration. We need to discard the
> >> associated RAMBlocks from the RAM state on the source VM and let the
> >> destination VM rebuild the memory mappings on the new host in the
> >> post_load() operation just before resuming the system.
> >>
> >> To achieve this goal, the following introduces a new RAMBlock flag
> >> RAM_MIGRATABLE which is updated in the vmstate_register_ram() and
> >> vmstate_unregister_ram() routines. This flag is then used by the
> >> migration to identify RAMBlocks to discard on the source. Some checks
> >> are also performed on the destination to make sure nothing invalid was
> >> sent.
> >>
> >> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> > 
> > Hi Cédric,
> >   Yes, this is looking nicer; the macro makes the changes quite small.
> > A couple of questions:
> > 
> >> ---
> >>
> >>  Changes since v1:
> >>
> >>  - introduced a new RAMBlock flag RAM_MIGRATABLE
> >>  - fixed RAMBLOCK_FOREACH_MIGRATABLE() macro to handle code that
> >>    omitted {}. 
> >>
> >>  exec.c                    | 10 ++++++++++
> >>  include/exec/cpu-common.h |  1 +
> >>  migration/ram.c           | 42 ++++++++++++++++++++++++++++++++----------
> >>  3 files changed, 43 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/exec.c b/exec.c
> >> index 02b1efebb7c3..e9432c06294e 100644
> >> --- a/exec.c
> >> +++ b/exec.c
> >> @@ -104,6 +104,9 @@ static MemoryRegion io_mem_unassigned;
> >>   * (Set during postcopy)
> >>   */
> >>  #define RAM_UF_ZEROPAGE (1 << 3)
> >> +
> >> +/* RAM can be migrated */
> >> +#define RAM_MIGRATABLE (1 << 4)
> >>  #endif
> >>  
> >>  #ifdef TARGET_PAGE_BITS_VARY
> >> @@ -1807,6 +1810,11 @@ void qemu_ram_set_uf_zeroable(RAMBlock *rb)
> >>      rb->flags |= RAM_UF_ZEROPAGE;
> >>  }
> >>  
> >> +bool qemu_ram_is_migratable(RAMBlock *rb)
> >> +{
> >> +    return rb->flags & RAM_MIGRATABLE;
> >> +}
> >> +
> >>  /* Called with iothread lock held.  */
> >>  void qemu_ram_set_idstr(RAMBlock *new_block, const char *name, DeviceState *dev)
> >>  {
> >> @@ -1823,6 +1831,7 @@ void qemu_ram_set_idstr(RAMBlock *new_block, const char *name, DeviceState *dev)
> >>          }
> >>      }
> >>      pstrcat(new_block->idstr, sizeof(new_block->idstr), name);
> >> +    new_block->flags |= RAM_MIGRATABLE;
> >>  
> >>      rcu_read_lock();
> >>      RAMBLOCK_FOREACH(block) {
> >> @@ -1845,6 +1854,7 @@ void qemu_ram_unset_idstr(RAMBlock *block)
> >>       */
> >>      if (block) {
> >>          memset(block->idstr, 0, sizeof(block->idstr));
> >> +        block->flags &= ~RAM_MIGRATABLE;
> >>      }
> >>  }
> > 
> > Why in qemu_ram_set_idstr and qemu_ram_(un)set_idstr ?  It seems an
> > odd place to put them.
> 
> The only place where this routines are called is from vmstate_un/register_ram()
> It seemed unnecessary to add an extra interface qemu_ram_un/set_migratable().
> 
> May be should just rename qemu_ram_un/set_idstr() ?
> 
> > 
> >> diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
> >> index 24d335f95d45..96854519b514 100644
> >> --- a/include/exec/cpu-common.h
> >> +++ b/include/exec/cpu-common.h
> >> @@ -75,6 +75,7 @@ const char *qemu_ram_get_idstr(RAMBlock *rb);
> >>  bool qemu_ram_is_shared(RAMBlock *rb);
> >>  bool qemu_ram_is_uf_zeroable(RAMBlock *rb);
> >>  void qemu_ram_set_uf_zeroable(RAMBlock *rb);
> >> +bool qemu_ram_is_migratable(RAMBlock *rb);
> >>  
> >>  size_t qemu_ram_pagesize(RAMBlock *block);
> >>  size_t qemu_ram_pagesize_largest(void);
> >> diff --git a/migration/ram.c b/migration/ram.c
> >> index 0e90efa09236..89c3accc4e26 100644
> >> --- a/migration/ram.c
> >> +++ b/migration/ram.c
> >> @@ -187,6 +187,11 @@ void ramblock_recv_bitmap_set_range(RAMBlock *rb, void *host_addr,
> >>                        nr);
> >>  }
> >>  
> >> +/* Should be holding either ram_list.mutex, or the RCU lock. */
> >> +#define RAMBLOCK_FOREACH_MIGRATABLE(block)             \
> >> +    RAMBLOCK_FOREACH(block)                            \
> >> +        if (!qemu_ram_is_migratable(block)) {} else
> >> +
> >>  /*
> >>   * An outstanding page request, on the source, having been received
> >>   * and queued
> >> @@ -780,6 +785,10 @@ unsigned long migration_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
> >>      unsigned long *bitmap = rb->bmap;
> >>      unsigned long next;
> >>  
> >> +    if (!qemu_ram_is_migratable(rb)) {
> >> +        return size;
> >> +    }
> >> +
> >>      if (rs->ram_bulk_stage && start > 0) {
> >>          next = start + 1;
> >>      } else {
> >> @@ -825,7 +834,7 @@ uint64_t ram_pagesize_summary(void)
> >>      RAMBlock *block;
> >>      uint64_t summary = 0;
> >>  
> >> -    RAMBLOCK_FOREACH(block) {
> >> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
> >>          summary |= block->page_size;
> >>      }
> >>  
> >> @@ -849,7 +858,7 @@ static void migration_bitmap_sync(RAMState *rs)
> >>  
> >>      qemu_mutex_lock(&rs->bitmap_mutex);
> >>      rcu_read_lock();
> >> -    RAMBLOCK_FOREACH(block) {
> >> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
> >>          migration_bitmap_sync_range(rs, block, 0, block->used_length);
> >>      }
> >>      rcu_read_unlock();
> >> @@ -1499,6 +1508,10 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss,
> >>      size_t pagesize_bits =
> >>          qemu_ram_pagesize(pss->block) >> TARGET_PAGE_BITS;
> >>  
> >> +    if (!qemu_ram_is_migratable(pss->block)) {
> >> +        return 0;
> >> +    }
> >> +
> > 
> > We might want to print a diagnostic there - it shouldn't happen right?
> 
> yes it should not. it should even be fatal. 
> 
> >>      do {
> >>          tmppages = ram_save_target_page(rs, pss, last_stage);
> >>          if (tmppages < 0) {
> >> @@ -1587,7 +1600,7 @@ uint64_t ram_bytes_total(void)
> >>      uint64_t total = 0;
> >>  
> >>      rcu_read_lock();
> >> -    RAMBLOCK_FOREACH(block) {
> >> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
> >>          total += block->used_length;
> >>      }
> >>      rcu_read_unlock();
> >> @@ -1642,7 +1655,7 @@ static void ram_save_cleanup(void *opaque)
> >>       */
> >>      memory_global_dirty_log_stop();
> >>  
> >> -    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> >> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
> >>          g_free(block->bmap);
> >>          block->bmap = NULL;
> >>          g_free(block->unsentmap);
> >> @@ -1705,7 +1718,7 @@ void ram_postcopy_migrated_memory_release(MigrationState *ms)
> >>  {
> >>      struct RAMBlock *block;
> >>  
> >> -    RAMBLOCK_FOREACH(block) {
> >> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
> >>          unsigned long *bitmap = block->bmap;
> >>          unsigned long range = block->used_length >> TARGET_PAGE_BITS;
> >>          unsigned long run_start = find_next_zero_bit(bitmap, range, 0);
> >> @@ -1783,7 +1796,7 @@ static int postcopy_each_ram_send_discard(MigrationState *ms)
> >>      struct RAMBlock *block;
> >>      int ret;
> >>  
> >> -    RAMBLOCK_FOREACH(block) {
> >> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
> >>          PostcopyDiscardState *pds =
> >>              postcopy_discard_send_init(ms, block->idstr);
> >>  
> >> @@ -1991,7 +2004,7 @@ int ram_postcopy_send_discard_bitmap(MigrationState *ms)
> >>      rs->last_sent_block = NULL;
> >>      rs->last_page = 0;
> >>  
> >> -    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> >> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
> >>          unsigned long pages = block->used_length >> TARGET_PAGE_BITS;
> >>          unsigned long *bitmap = block->bmap;
> >>          unsigned long *unsentmap = block->unsentmap;
> >> @@ -2150,7 +2163,7 @@ static void ram_list_init_bitmaps(void)
> >>  
> >>      /* Skip setting bitmap if there is no RAM */
> >>      if (ram_bytes_total()) {
> >> -        QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> >> +        RAMBLOCK_FOREACH_MIGRATABLE(block) {
> >>              pages = block->max_length >> TARGET_PAGE_BITS;
> >>              block->bmap = bitmap_new(pages);
> >>              bitmap_set(block->bmap, 0, pages);
> >> @@ -2226,7 +2239,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
> >>  
> >>      qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
> >>  
> >> -    RAMBLOCK_FOREACH(block) {
> >> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
> >>          qemu_put_byte(f, strlen(block->idstr));
> >>          qemu_put_buffer(f, (uint8_t *)block->idstr, strlen(block->idstr));
> >>          qemu_put_be64(f, block->used_length);
> > 
> > Good, that's all quite neat;  we've just got to watch out we don't
> > accidentally add any RAMBLOCK_FOREACH's back.
> 
> Thank,
> 
> C.
> 
> >> @@ -2471,6 +2484,11 @@ static inline RAMBlock *ram_block_from_stream(QEMUFile *f, int flags)
> >>          return NULL;
> >>      }
> >>  
> >> +    if (!qemu_ram_is_migratable(block)) {
> >> +        error_report("block %s should not be migrated !", id);
> >> +        return NULL;
> >> +    }
> >> +
> >>      return block;
> >>  }
> >>  
> >> @@ -2921,7 +2939,11 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
> >>                  length = qemu_get_be64(f);
> >>  
> >>                  block = qemu_ram_block_by_name(id);
> >> -                if (block) {
> >> +                if (block && !qemu_ram_is_migratable(block)) {
> >> +                    error_report("block %s should not be migrated !", id);
> >> +                    ret = -EINVAL;
> >> +
> >> +                } else if (block) {
> >>                      if (length != block->used_length) {
> >>                          Error *local_err = NULL;
> >>  
> >> -- 
> >> 2.13.6
> > 
> > Dave
> > 
> >>
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

next prev parent reply	other threads:[~2018-05-09 11:08 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-13  7:52 [Qemu-devel] [PATCH v2] migration: discard non-migratable RAMBlocks Cédric Le Goater
2018-04-13  7:56 ` no-reply
2018-04-13 11:18   ` Juan Quintela
2018-04-13 11:53     ` Peter Maydell
2018-04-19 16:58 ` Dr. David Alan Gilbert
2018-04-20  6:59   ` Cédric Le Goater
2018-04-20  8:47     ` Peter Maydell
2018-04-20  9:05       ` Cédric Le Goater
2018-05-09 11:08     ` Dr. David Alan Gilbert [this message]
2018-05-09 11:33       ` Peter Maydell
2018-05-09 11:54         ` Paolo Bonzini
2018-05-10 19:15         ` Dr. David Alan Gilbert
2018-05-10 21:43           ` Cédric Le Goater
2018-04-19 17:32 ` Peter Maydell
2018-04-19 17:45   ` Edgar E. Iglesias
2018-04-20  8:14     ` KONRAD Frederic
2018-04-20  9:24       ` Edgar E. Iglesias
2018-04-20 12:09 ` Peter Maydell
2018-04-21  8:52   ` Cédric Le Goater
2018-04-21  9:33     ` Peter Maydell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180509110833.GD2527@work-vm \
    --to=dgilbert@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=clg@kaod.org \
    --cc=david@gibson.dropbear.id.au \
    --cc=joonas.lahtinen@linux.intel.com \
    --cc=kevin.tian@intel.com \
    --cc=kwankhede@nvidia.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=yulei.zhang@intel.com \
    --cc=zhenyuw@linux.intel.com \
    --cc=zhi.a.wang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).