Re: [Qemu-devel] [RFC PATCH] migration: discard RAMBlocks of type ram_device

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: "Cédric Le Goater" <clg@kaod.org>, jiangshanlai@gmail.com
Cc: qemu-devel@nongnu.org, Juan Quintela <quintela@redhat.com>,
	alex.williamson@redhat.com,
	David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [Qemu-devel] [RFC PATCH] migration: discard RAMBlocks of type ram_device
Date: Wed, 11 Apr 2018 20:21:37 +0100	[thread overview]
Message-ID: <20180411192136.GL2667@work-vm> (raw)
In-Reply-To: <20180411172014.24711-1-clg@kaod.org>

* Cédric Le Goater (clg@kaod.org) wrote:
> Here is some context for this strange change request.
> 
> On the POWER9 processor, the XIVE interrupt controller can control
> interrupt sources using MMIO to trigger events, to EOI or to turn off
> the sources. Priority management and interrupt acknowledgment is also
> controlled by MMIO in the presenter subengine.
> 
> These MMIO regions are exposed to guests in QEMU with a set of 'ram
> device' memory mappings, similarly to VFIO, and the VMAs are populated
> dynamically with the appropriate pages using a fault handler.
> 
> But, these regions are an issue for migration. We need to discard the
> associated RAMBlocks from the RAM state on the source VM and let the
> destination VM rebuild the memory mappings on the new host in the
> post_load() operation just before resuming the system.
> 
> This is the goal of the following proposal. Does it make sense ? It
> seems to be working enough to migrate a running guest but there might
> be a better, more subtle, approach.

If this is always true of RAM devices (which I suspect it is).

Interestingly, your patch comes less than 2 weeks after Lai Jiangshan's
 'add capability to bypass the shared memory'
   https://lists.nongnu.org/archive/html/qemu-devel/2018-03/msg07511.html

which is the only other case I think we've got of someone trying to
avoid transmitting a block.

We should try and merge the two sets to make them consistent; you've
covered some more cases (the other patch wasn't expected to work with
Postcopy anyway).
(At this rate then we can expect another 20 for the year....)

We should probably have:
   1) A     bool is_migratable_block(RAMBlock *)
   2) A RAMBLOCK_FOREACH_MIGRATABLE(block)  macro that is like
RAMBLOCK_FOREACH but does the call to is_migratable_block

then the changes should be mostly pretty tidy.

A sanity check is probably needed on load as well, to give a neat
error if for some reason the source transmits pages to you.

One other thing I notice is your code changes ram_bytes_total(),
where as the other patch avoids it;  I think your code is actually
more correct.

Is there *any* case in existing QEMUs where we migrate ram devices
succesfully, if so we've got to make it backwards compatible; but I
think you're saying there isn't.

Dave


> Thanks,
> 
> C.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
>  migration/ram.c | 42 ++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 40 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index 0e90efa09236..6404ccd046d8 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -780,6 +780,10 @@ unsigned long migration_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
>      unsigned long *bitmap = rb->bmap;
>      unsigned long next;
>  
> +    if (memory_region_is_ram_device(rb->mr)) {
> +        return size;
> +    }
> +
>      if (rs->ram_bulk_stage && start > 0) {
>          next = start + 1;
>      } else {
> @@ -826,6 +830,9 @@ uint64_t ram_pagesize_summary(void)
>      uint64_t summary = 0;
>  
>      RAMBLOCK_FOREACH(block) {
> +        if (memory_region_is_ram_device(block->mr)) {
> +            continue;
> +        }
>          summary |= block->page_size;
>      }
>  
> @@ -850,6 +857,9 @@ static void migration_bitmap_sync(RAMState *rs)
>      qemu_mutex_lock(&rs->bitmap_mutex);
>      rcu_read_lock();
>      RAMBLOCK_FOREACH(block) {
> +        if (memory_region_is_ram_device(block->mr)) {
> +            continue;
> +        }
>          migration_bitmap_sync_range(rs, block, 0, block->used_length);
>      }
>      rcu_read_unlock();
> @@ -1499,6 +1509,10 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss,
>      size_t pagesize_bits =
>          qemu_ram_pagesize(pss->block) >> TARGET_PAGE_BITS;
>  
> +    if (memory_region_is_ram_device(pss->block->mr)) {
> +        return 0;
> +    }
> +

Now we shouldn't actually end up here should we - so I suggest an
error_report and returning -EINVAL.

>      do {
>          tmppages = ram_save_target_page(rs, pss, last_stage);
>          if (tmppages < 0) {
> @@ -1588,6 +1602,9 @@ uint64_t ram_bytes_total(void)
>  
>      rcu_read_lock();
>      RAMBLOCK_FOREACH(block) {
> +        if (memory_region_is_ram_device(block->mr)) {
> +            continue;
> +        }
>          total += block->used_length;
>      }
>      rcu_read_unlock();
> @@ -1643,6 +1660,9 @@ static void ram_save_cleanup(void *opaque)
>      memory_global_dirty_log_stop();
>  
>      QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> +        if (memory_region_is_ram_device(block->mr)) {
> +            continue;
> +        }
>          g_free(block->bmap);
>          block->bmap = NULL;
>          g_free(block->unsentmap);
> @@ -1710,6 +1730,9 @@ void ram_postcopy_migrated_memory_release(MigrationState *ms)
>          unsigned long range = block->used_length >> TARGET_PAGE_BITS;
>          unsigned long run_start = find_next_zero_bit(bitmap, range, 0);
>  
> +        if (memory_region_is_ram_device(block->mr)) {
> +            continue;
> +        }
>          while (run_start < range) {
>              unsigned long run_end = find_next_bit(bitmap, range, run_start + 1);
>              ram_discard_range(block->idstr, run_start << TARGET_PAGE_BITS,
> @@ -1784,8 +1807,13 @@ static int postcopy_each_ram_send_discard(MigrationState *ms)
>      int ret;
>  
>      RAMBLOCK_FOREACH(block) {
> -        PostcopyDiscardState *pds =
> -            postcopy_discard_send_init(ms, block->idstr);
> +        PostcopyDiscardState *pds;
> +
> +        if (memory_region_is_ram_device(block->mr)) {
> +            continue;
> +        }
> +
> +        pds = postcopy_discard_send_init(ms, block->idstr);
>  
>          /*
>           * Postcopy sends chunks of bitmap over the wire, but it
> @@ -1996,6 +2024,10 @@ int ram_postcopy_send_discard_bitmap(MigrationState *ms)
>          unsigned long *bitmap = block->bmap;
>          unsigned long *unsentmap = block->unsentmap;
>  
> +        if (memory_region_is_ram_device(block->mr)) {
> +            continue;
> +        }
> +
>          if (!unsentmap) {
>              /* We don't have a safe way to resize the sentmap, so
>               * if the bitmap was resized it will be NULL at this
> @@ -2151,6 +2183,9 @@ static void ram_list_init_bitmaps(void)
>      /* Skip setting bitmap if there is no RAM */
>      if (ram_bytes_total()) {
>          QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> +            if (memory_region_is_ram_device(block->mr)) {
> +                continue;
> +            }
>              pages = block->max_length >> TARGET_PAGE_BITS;
>              block->bmap = bitmap_new(pages);
>              bitmap_set(block->bmap, 0, pages);
> @@ -2227,6 +2262,9 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>      qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
>  
>      RAMBLOCK_FOREACH(block) {
> +        if (memory_region_is_ram_device(block->mr)) {
> +            continue;
> +        }
>          qemu_put_byte(f, strlen(block->idstr));
>          qemu_put_buffer(f, (uint8_t *)block->idstr, strlen(block->idstr));
>          qemu_put_be64(f, block->used_length);
> -- 
> 2.13.6
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

next prev parent reply	other threads:[~2018-04-11 19:21 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-11 17:20 [Qemu-devel] [RFC PATCH] migration: discard RAMBlocks of type ram_device Cédric Le Goater
2018-04-11 17:55 ` Alex Williamson
2018-04-11 19:04   ` Kirti Wankhede
2018-04-12 15:59   ` Zhang, Yulei
2018-04-12 16:23     ` Alex Williamson
2018-04-13  6:00       ` Kirti Wankhede
2018-04-11 19:21 ` Dr. David Alan Gilbert [this message]
2018-04-12  7:02   ` Cédric Le Goater
2018-04-12  8:11     ` Dr. David Alan Gilbert
2018-04-12  9:03     ` Peter Maydell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180411192136.GL2667@work-vm \
    --to=dgilbert@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=clg@kaod.org \
    --cc=david@gibson.dropbear.id.au \
    --cc=jiangshanlai@gmail.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.