All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Denis Plotnikov <dplotnikov@virtuozzo.com>
Cc: quintela@redhat.com, pbonzini@redhat.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v0 7/7] migration: add background snapshotting
Date: Thu, 12 Jul 2018 19:59:43 +0100	[thread overview]
Message-ID: <20180712185942.GF2610@work-vm> (raw)
In-Reply-To: <20180629080320.320144-8-dplotnikov@virtuozzo.com>

* Denis Plotnikov (dplotnikov@virtuozzo.com) wrote:
> The patch enables to save vmstate to a migration thread
> in the background: ram is being saved while vCPUs are running.
> This is done to reduce downtime on vm snapshotting: the majority
> of vmstate is ram, the rest of devices consumes only a few MB of
> memory on a typical vm.
> By this moment, there were no means to make a snapshot without
> full vm stopping. From now, one can save a vm state having the vm
> stopped only for a couple of seconds to save devices' states. Then
> the vm resumed and ram is written without the vm full stopping.
> 
> To use async vm state it's needed to enable "background-snapshot"
> migration capability and start the migration.
> 
> Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
> ---
>  migration/migration.c | 104 +++++++++++++++++++++++++++++++++++++++++-
>  migration/savevm.c    |  91 +++++++++++++++++++-----------------
>  migration/savevm.h    |   2 +
>  3 files changed, 154 insertions(+), 43 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 131d0904e4..b062bd8ddc 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -2188,6 +2188,100 @@ bool migrate_colo_enabled(void)
>      return s->enabled_capabilities[MIGRATION_CAPABILITY_X_COLO];
>  }
>  
> +static void background_snapshot_sigsegv_handler(int unused0, siginfo_t *info, void *unused1)
> +{
> +    if (ram_process_page_fault(info->si_addr)) {
> +        assert(false);
> +    }
> +}
> +
> +static void *background_snapshot_thread(void *opaque)
> +{
> +    MigrationState *m = opaque;
> +    QIOChannelBuffer *bioc;
> +    QEMUFile *fb;
> +    RamBlockList *ram_blocks;
> +    int res = 0;
> +
> +    rcu_register_thread();
> +
> +    qemu_file_set_rate_limit(m->to_dst_file, INT64_MAX);
> +
> +    qemu_mutex_lock_iothread();
> +    vm_stop(RUN_STATE_PAUSED);
> +
> +    qemu_savevm_state_header(m->to_dst_file);
> +    qemu_mutex_unlock_iothread();
> +    qemu_savevm_state_setup(m->to_dst_file);
> +    qemu_mutex_lock_iothread();

That's curiously different locking from the ones in migration.c and
savevm.c (qemu_savevm_state)

> +    migrate_set_state(&m->state, MIGRATION_STATUS_SETUP,
> +                      MIGRATION_STATUS_ACTIVE);
> +
> +    // save the device state to the temporary buffer
> +    bioc = qio_channel_buffer_new(4096);
> +    qio_channel_set_name(QIO_CHANNEL(bioc), "vmstate-buffer");
> +    fb = qemu_fopen_channel_output(QIO_CHANNEL(bioc));
> +    object_unref(OBJECT(bioc));
> +
> +    ram_blocks = ram_blocks_get();
> +    ram_blocks_fill(ram_blocks);
> +    ram_blocks_set_ro(ram_blocks);
> +
> +    if(global_state_store()) {
> +        goto exit;
> +    }
> +
> +    if (qemu_save_all_device_states(fb, false, true) < 0) {
> +        migrate_set_state(&m->state, MIGRATION_STATUS_ACTIVE,
> +                          MIGRATION_STATUS_FAILED);
> +        goto exit;
> +    }
> +
> +    sigsegv_user_handler_set(background_snapshot_sigsegv_handler);

That's actually a good test if you're write protecting the RAM before
saving the device state; they *shouldn't* write to ram doing that
but I don't trust them.

> +    smp_mb();

Comment why that mb is there?

> +    vm_start();
> +    qemu_mutex_unlock_iothread();
> +
> +    while (!res) {
> +        res = qemu_savevm_state_iterate(m->to_dst_file, false);

I'm surprised it goes around this loop much, given that you've
set hte bandwidth to max and I assume you never see dirty pages.

> +        if (res < 0 || qemu_file_get_error(m->to_dst_file)) {
> +            migrate_set_state(&m->state, MIGRATION_STATUS_ACTIVE,
> +                              MIGRATION_STATUS_FAILED);
> +            goto exit;
> +        }
> +    }
> +
> +    // By this moment we have RAM saved into the stream
> +    // The next step is to flush the device state right after the
> +    // RAM saved. The rest of device state is stored in
> +    // the temporary buffer. Let's flush the buffer into the stream.
> +    qemu_put_buffer(m->to_dst_file, bioc->data, bioc->usage);
> +    qemu_fflush(m->to_dst_file);
> +
> +    if (qemu_file_get_error(m->to_dst_file)) {
> +        migrate_set_state(&m->state, MIGRATION_STATUS_ACTIVE,
> +                          MIGRATION_STATUS_FAILED);
> +        goto exit;
> +    }
> +
> +    migrate_set_state(&m->state, MIGRATION_STATUS_ACTIVE,
> +                                 MIGRATION_STATUS_COMPLETED);
> +exit:
> +    ram_blocks_set_rw(ram_blocks);
> +    ram_blocks_clear(ram_blocks);
> +    smp_mb();
> +    sigsegv_user_handler_reset();
> +    qemu_fclose(fb);
> +    qemu_mutex_lock_iothread();
> +    qemu_savevm_state_cleanup();
> +    qemu_mutex_unlock_iothread();
> +    rcu_unregister_thread();
> +    return NULL;
> +}
> +
>  /*
>   * Master migration thread on the source VM.
>   * It drives the migration and pumps the data down the outgoing channel.
> @@ -2402,8 +2496,14 @@ void migrate_fd_connect(MigrationState *s)
>          migrate_fd_cleanup(s);
>          return;
>      }
> -    qemu_thread_create(&s->thread, "live_migration", migration_thread, s,
> -                       QEMU_THREAD_JOINABLE);
> +    if (migrate_background_snapshot()) {
> +        qemu_thread_create(&s->thread, "bg_snapshot",
> +                           background_snapshot_thread, s,
> +                           QEMU_THREAD_JOINABLE);
> +    } else {
> +        qemu_thread_create(&s->thread, "live_migration", migration_thread, s,
> +                           QEMU_THREAD_JOINABLE);
> +    }
>      s->migration_thread_running = true;
>  }
>  
> diff --git a/migration/savevm.c b/migration/savevm.c
> index f202c3de3a..641b897ff5 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1107,51 +1107,15 @@ void qemu_savevm_state_complete_postcopy(QEMUFile *f)
>      qemu_fflush(f);
>  }
>  
> -int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
> -                                       bool inactivate_disks)
> +int qemu_save_all_device_states(QEMUFile *f, bool inactivate_disks,
> +                                bool send_eof)

We've got qemu_save_device_state as well; lots of variations.

Dave

>  {
> -    QJSON *vmdesc;
>      int vmdesc_len;
>      SaveStateEntry *se;
>      int ret;
> -    bool in_postcopy = migration_in_postcopy();
> -
> -    trace_savevm_state_complete_precopy();
> +    QJSON *vmdesc = qjson_new();
>  
>      cpu_synchronize_all_states();
> -
> -    QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
> -        if (!se->ops ||
> -            (in_postcopy && se->ops->has_postcopy &&
> -             se->ops->has_postcopy(se->opaque)) ||
> -            (in_postcopy && !iterable_only) ||
> -            !se->ops->save_live_complete_precopy) {
> -            continue;
> -        }
> -
> -        if (se->ops && se->ops->is_active) {
> -            if (!se->ops->is_active(se->opaque)) {
> -                continue;
> -            }
> -        }
> -        trace_savevm_section_start(se->idstr, se->section_id);
> -
> -        save_section_header(f, se, QEMU_VM_SECTION_END);
> -
> -        ret = se->ops->save_live_complete_precopy(f, se->opaque);
> -        trace_savevm_section_end(se->idstr, se->section_id, ret);
> -        save_section_footer(f, se);
> -        if (ret < 0) {
> -            qemu_file_set_error(f, ret);
> -            return -1;
> -        }
> -    }
> -
> -    if (iterable_only) {
> -        return 0;
> -    }
> -
> -    vmdesc = qjson_new();
>      json_prop_int(vmdesc, "page_size", qemu_target_page_size());
>      json_start_array(vmdesc, "devices");
>      QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
> @@ -1193,8 +1157,8 @@ int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
>              return ret;
>          }
>      }
> -    if (!in_postcopy) {
> -        /* Postcopy stream will still be going */
> +
> +    if (send_eof) {
>          qemu_put_byte(f, QEMU_VM_EOF);
>      }
>  
> @@ -1213,6 +1177,51 @@ int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
>      return 0;
>  }
>  
> +int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
> +                                       bool inactivate_disks)
> +{
> +    SaveStateEntry *se;
> +    int ret;
> +    bool in_postcopy = migration_in_postcopy();
> +
> +    trace_savevm_state_complete_precopy();
> +
> +    cpu_synchronize_all_states();
> +
> +    QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
> +        if (!se->ops ||
> +            (in_postcopy && se->ops->has_postcopy &&
> +             se->ops->has_postcopy(se->opaque)) ||
> +            (in_postcopy && !iterable_only) ||
> +            !se->ops->save_live_complete_precopy) {
> +            continue;
> +        }
> +
> +        if (se->ops && se->ops->is_active) {
> +            if (!se->ops->is_active(se->opaque)) {
> +                continue;
> +            }
> +        }
> +        trace_savevm_section_start(se->idstr, se->section_id);
> +
> +        save_section_header(f, se, QEMU_VM_SECTION_END);
> +
> +        ret = se->ops->save_live_complete_precopy(f, se->opaque);
> +        trace_savevm_section_end(se->idstr, se->section_id, ret);
> +        save_section_footer(f, se);
> +        if (ret < 0) {
> +            qemu_file_set_error(f, ret);
> +            return -1;
> +        }
> +    }
> +
> +    if (iterable_only) {
> +        return 0;
> +    }
> +
> +    return qemu_save_all_device_states(f, inactivate_disks, !in_postcopy);
> +}
> +
>  /* Give an estimate of the amount left to be transferred,
>   * the result is split into the amount for units that can and
>   * for units that can't do postcopy.
> diff --git a/migration/savevm.h b/migration/savevm.h
> index 295c4a1f2c..87b278142d 100644
> --- a/migration/savevm.h
> +++ b/migration/savevm.h
> @@ -55,4 +55,6 @@ void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name,
>  int qemu_loadvm_state(QEMUFile *f);
>  void qemu_loadvm_state_cleanup(void);
>  
> +int qemu_save_all_device_states(QEMUFile *f, bool inactivate_disks,
> +                                bool send_eof);
>  #endif
> -- 
> 2.17.0
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

  reply	other threads:[~2018-07-12 18:59 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-29  8:03 [Qemu-devel] [PATCH v0 0/7] Background snapshots Denis Plotnikov
2018-06-29  8:03 ` [Qemu-devel] [PATCH v0 1/7] migration: add background snapshot capability Denis Plotnikov
2018-06-29 16:02   ` Eric Blake
2018-07-12  9:03   ` Dr. David Alan Gilbert
2018-06-29  8:03 ` [Qemu-devel] [PATCH v0 2/7] bitops: add some atomic versions of bitmap operations Denis Plotnikov
2018-07-12  9:21   ` Dr. David Alan Gilbert
2018-06-29  8:03 ` [Qemu-devel] [PATCH v0 3/7] threads: add infrastructure to process sigsegv Denis Plotnikov
2018-07-12  9:53   ` Dr. David Alan Gilbert
2018-06-29  8:03 ` [Qemu-devel] [PATCH v0 4/7] migration: add background snapshot infrastructure Denis Plotnikov
2018-07-12 11:46   ` Dr. David Alan Gilbert
2018-06-29  8:03 ` [Qemu-devel] [PATCH v0 5/7] kvm: add failed memeory access exit reason Denis Plotnikov
2018-06-29  8:03 ` [Qemu-devel] [PATCH v0 6/7] kvm: add vCPU failed memeory access processing Denis Plotnikov
2018-06-29  8:03 ` [Qemu-devel] [PATCH v0 7/7] migration: add background snapshotting Denis Plotnikov
2018-07-12 18:59   ` Dr. David Alan Gilbert [this message]
2018-06-29 11:53 ` [Qemu-devel] [PATCH v0 0/7] Background snapshots Dr. David Alan Gilbert
2018-07-25 10:18   ` Peter Xu
2018-07-25 19:17     ` Dr. David Alan Gilbert
2018-07-25 20:04       ` Andrea Arcangeli
2018-07-26  8:51         ` Paolo Bonzini
2018-07-26  9:23           ` Peter Xu
2018-08-13 12:55             ` Denis Plotnikov
2018-08-13 19:00               ` Dr. David Alan Gilbert
2018-08-14  5:45                 ` Peter Xu
2018-08-14  6:13                 ` Mike Rapoport
2018-08-14 23:16                 ` Mike Kravetz
2018-07-26 15:13           ` Dr. David Alan Gilbert
2018-07-02 11:23 ` Peter Xu
2018-07-02 12:40   ` Denis Plotnikov
2018-07-03  5:54     ` Peter Xu
2018-07-13  5:20 ` Peter Xu
2018-07-16 15:00   ` Denis Plotnikov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180712185942.GF2610@work-vm \
    --to=dgilbert@redhat.com \
    --cc=dplotnikov@virtuozzo.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.