All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: arei.gonglei@huawei.com, qemu-devel@nongnu.org, quintela@redhat.com
Subject: Re: [Qemu-devel] [PATCH 1/1] Stop reinit of XBZRLE.lock
Date: Wed, 19 Mar 2014 09:31:39 +0000	[thread overview]
Message-ID: <20140319093139.GA2371@work-vm> (raw)
In-Reply-To: <8761na7hl8.fsf@blackfin.pond.sub.org>

* Markus Armbruster (armbru@redhat.com) wrote:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:
> 
> > * Markus Armbruster (armbru@redhat.com) wrote:
> >> "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> writes:
> >
> > <snip>
> >
> >> > diff --git a/arch_init.c b/arch_init.c
> >> > index 60c975d..16474b5 100644
> >> > --- a/arch_init.c
> >> > +++ b/arch_init.c
> >> > @@ -167,10 +167,13 @@ static struct {
> >> >      /* Cache for XBZRLE, Protected by lock. */
> >> >      PageCache *cache;
> >> >      QemuMutex lock;
> >> > +    bool lock_init; /* True once we've init'd lock */
> >> >  } XBZRLE = {
> >> >      .encoded_buf = NULL,
> >> >      .current_buf = NULL,
> >> >      .cache = NULL,
> >> > +    /* .lock is initialised in ram_save_setup */
> >> > +    .lock_init = false
> >> >  };
> >> 
> >> Redundant initializers.
> >
> > Given how subtle lock stuff is, I'll take making it obvious as more important.
> 
> That a static variable starts out zeroed is plenty obvious.  More
> obvious in fact than a bunch of explicit initializers I actually have to
> read to see what they mean.
> 
> We rely on implicit zero-initialization of static variables all over the
> place.
> 
> >> >  /* buffer used for XBZRLE decoding */
> >> >  static uint8_t *xbzrle_decoded_buf;
> >> > @@ -187,6 +190,11 @@ static void XBZRLE_cache_unlock(void)
> >> >          qemu_mutex_unlock(&XBZRLE.lock);
> >> >  }
> >> >  
> >> > +/* called from qmp_migrate_set_cache_size in main thread, possibly while
> >> > + * a migration is in progress.
> >> > + * A running migration maybe using the cache and might finish during this
> >> 
> >> may be
> >> 
> >> > + * call, hence changes to the cache are protected by XBZRLE.lock().
> >> > + */
> >> 
> >> Style nit, since I'm nitpicking spelling already: our winged comments
> >> usually look like
> >> 
> >> /*
> >>  * Text
> >>  */
> >
> > Oops, yes; if I need to respin I'll fix that (hmm I wonder if the check
> > script could be tweaked to find those).
> >
> >> >  int64_t xbzrle_cache_resize(int64_t new_size)
> >> >  {
> >> >      PageCache *new_cache, *cache_to_free;
> >> > @@ -195,9 +203,12 @@ int64_t xbzrle_cache_resize(int64_t new_size)
> >> >          return -1;
> >> >      }
> >> >  
> >> > -    /* no need to lock, the current thread holds qemu big lock */
> >> > +    /* The current thread holds qemu big lock, and we hold it while creating
> >> > +     * the cache in ram_save_setup, thus it's safe to test if the
> >> > +     * cache exists yet without it's own lock (which might not have been
> >> > +     * init'd yet)
> >> > +     */
> >> >      if (XBZRLE.cache != NULL) {
> >> > -        /* check XBZRLE.cache again later */
> >> >          if (pow2floor(new_size) == migrate_xbzrle_cache_size()) {
> >> >              return pow2floor(new_size);
> >> >          }
> >> > @@ -209,7 +220,10 @@ int64_t xbzrle_cache_resize(int64_t new_size)
> >> >          }
> >> >  
> >> >          XBZRLE_cache_lock();
> >> > -        /* the XBZRLE.cache may have be destroyed, check it again */
> >> > +        /* the migration might have finished between the check above and us
> >> > +         * taking the lock,  causing XBZRLE.cache to be destroyed
> >> > +         *   check it again
> >> > +         */
> >> >          if (XBZRLE.cache != NULL) {
> >> >              cache_to_free = XBZRLE.cache;
> >> >              XBZRLE.cache = new_cache;
> >> > @@ -744,7 +758,15 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
> >> >              DPRINTF("Error creating cache\n");
> >> >              return -1;
> >> >          }
> >> > -        qemu_mutex_init(&XBZRLE.lock);
> >> > +        /* mutex's can't be reinit'd without destroying them
> >> > +         * and we've not got a good place to destroy it that
> >> > +         * we can guarantee isn't being called when we might want
> >> > +         * to hold the lock.
> >> > +         */
> >> > +        if (!XBZRLE.lock_init) {
> >> > +            XBZRLE.lock_init = true;
> >> > +            qemu_mutex_init(&XBZRLE.lock);
> >> > +        }
> >> >          qemu_mutex_unlock_iothread();
> >> >  
> >> >          /* We prefer not to abort if there is no memory */
> >> 
> >> I detest how the locking works in xbzrle_cache_resize().
> >
> > Yeh, it's tricky - I think the way to think about it is that the 
> > lock protects the cache and it's contents, not the pointer.
> > Except that's not really true when it swaps the new one in - hmm.
> 
> Yup, and that's what I call confusing and way too subtle.

Yes, agreed - this shouldn't be a hard problem that needs subtlety.

> >> The first XBZRLE.cache != NULL is not under XBZRLE.lock.  As you explain
> >> in the comment, this is required, because we foolishly delay lock
> >> initialization until a migration starts, and it's safe, because cache
> >> creation is also under the BQL.
> >
> > Some of this is down to the interfaces between migration generally,
> > the devices being migrated and the specials for RAM/iterative migration.
> >
> > 1) A lot of the ram migration data, including XBZRLE, is global data in
> > arch_init - this is bad.
> 
> As long as there can only be one migration active at a time, it's not
> really bad, isn't it?

I find it quite messy; there is state associated with migration all
over, I couldn't honestly tell you all the things that need to be
initialised, updated, etc when a migration is in progress.
I'd rather hang stuff off MigrationState.

> > 2) The management of these is generally glued onto the migration
> > setup/iterate/complete set of methods used during migration, however
> > they don't have any calls that correspond to the actual start/end of
> > migration, or the like that could be sanely used to do any init
> > that tied up with the actual start end of migration - this is bad.
> 
> Maybe.  But why delay the initialization of XBZRLE.lock to start of
> migration?  All the other members of XBZRLE are initialized on start of
> program!

That's just fall out from the shock that you couldn't staticly init
the lock, the original patch version statically init'd the lock in the
same way as the other members.

> > 3) Migration is in a separate thread and could finish at any time - this
> > is expected but complicates life, especially when (2) means that
> > the data structures for RAMs migration etc are cleared up when migration
> > finishes.
> 
> Four activities: initialization, cache setup, cache use, cache teardown.
> 
> Initialization can be done during program startup, where we don't have
> to worry about threads and locks.  The other three can then be simply
> put under the lock, and the comment "Protected by lock" becomes actually
> true.  Sketch appended.  It passes "make check", it must be purrfect!

I'm mostly OK with this, with one small change at the bottom.

> > I don't know the history but (1) seems to arrise from the
> > semi-arbitrary split between arch_init.c, memory.c, savevm.c, and probably
> > 3 or 4 other files I've failed to mention.
> 
> [...]
> 
> 
> From 6ab76b05789de80d4e0919404429fadcbc0ea403 Mon Sep 17 00:00:00 2001
> From: Markus Armbruster <armbru@redhat.com>
> Date: Wed, 19 Mar 2014 08:46:51 +0100
> Subject: [PATCH] XBZRLE: Fix and simplify locking of cache WIP
> 
> ---
>  arch_init.c                | 49 ++++++++++++++++++++++------------------------
>  include/sysemu/arch_init.h |  1 +
>  vl.c                       |  1 +
>  3 files changed, 25 insertions(+), 26 deletions(-)
> 
> diff --git a/arch_init.c b/arch_init.c
> index 60c975d..e41e9dd 100644
> --- a/arch_init.c
> +++ b/arch_init.c
> @@ -167,14 +167,16 @@ static struct {
>      /* Cache for XBZRLE, Protected by lock. */
>      PageCache *cache;
>      QemuMutex lock;
> -} XBZRLE = {
> -    .encoded_buf = NULL,
> -    .current_buf = NULL,
> -    .cache = NULL,
> -};
> +} XBZRLE;
> +
>  /* buffer used for XBZRLE decoding */
>  static uint8_t *xbzrle_decoded_buf;
>  
> +void XBZRLE_init(void)
> +{
> +    qemu_mutex_init(&XBZRLE.lock);
> +}
> +
>  static void XBZRLE_cache_lock(void)
>  {
>      if (migrate_use_xbzrle())
> @@ -189,39 +191,35 @@ static void XBZRLE_cache_unlock(void)
>  
>  int64_t xbzrle_cache_resize(int64_t new_size)
>  {
> -    PageCache *new_cache, *cache_to_free;
> +    PageCache *new_cache;
> +    int64_t ret;
>  
>      if (new_size < TARGET_PAGE_SIZE) {
>          return -1;
>      }
>  
> -    /* no need to lock, the current thread holds qemu big lock */
> +    XBZRLE_cache_lock();
> +
>      if (XBZRLE.cache != NULL) {
> -        /* check XBZRLE.cache again later */
>          if (pow2floor(new_size) == migrate_xbzrle_cache_size()) {
> -            return pow2floor(new_size);
> +            goto out_new_size;
>          }
>          new_cache = cache_init(new_size / TARGET_PAGE_SIZE,
>                                          TARGET_PAGE_SIZE);
>          if (!new_cache) {
>              DPRINTF("Error creating cache\n");
> -            return -1;
> +            goto out;
>          }
>  
> -        XBZRLE_cache_lock();
> -        /* the XBZRLE.cache may have be destroyed, check it again */
> -        if (XBZRLE.cache != NULL) {
> -            cache_to_free = XBZRLE.cache;
> -            XBZRLE.cache = new_cache;
> -        } else {
> -            cache_to_free = new_cache;
> -        }
> -        XBZRLE_cache_unlock();
> -
> -        cache_fini(cache_to_free);
> +        cache_fini(XBZRLE.cache);
> +        XBZRLE.cache = new_cache;
>      }
>  
> -    return pow2floor(new_size);
> +out_new_size:
> +    ret = pow2floor(new_size);
> +out:
> +    XBZRLE_cache_unlock();
> +    return ret;
>  }
>  
>  /* accounting for migration statistics */
> @@ -735,17 +733,16 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>      dirty_rate_high_cnt = 0;
>  
>      if (migrate_use_xbzrle()) {
> -        qemu_mutex_lock_iothread();
> +        XBZRLE_cache_lock();
>          XBZRLE.cache = cache_init(migrate_xbzrle_cache_size() /
>                                    TARGET_PAGE_SIZE,
>                                    TARGET_PAGE_SIZE);
>          if (!XBZRLE.cache) {
> -            qemu_mutex_unlock_iothread();
> +            XBZRLE_cache_unlock();
>              DPRINTF("Error creating cache\n");
>              return -1;
>          }
> -        qemu_mutex_init(&XBZRLE.lock);
> -        qemu_mutex_unlock_iothread();
> +        XBZRLE_cache_unlock();
>  
>          /* We prefer not to abort if there is no memory */
>          XBZRLE.encoded_buf = g_try_malloc0(TARGET_PAGE_SIZE);
> diff --git a/include/sysemu/arch_init.h b/include/sysemu/arch_init.h
> index be71bca..fdc3423 100644
> --- a/include/sysemu/arch_init.h
> +++ b/include/sysemu/arch_init.h
> @@ -26,6 +26,7 @@ enum {
>  
>  extern const uint32_t arch_type;
>  
> +void XBZRLE_init(void);
>  void select_soundhw(const char *optarg);
>  void do_acpitable_option(const QemuOpts *opts);
>  void do_smbios_option(QemuOpts *opts);
> diff --git a/vl.c b/vl.c
> index f0fe48b..24433c8 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -3008,6 +3008,7 @@ int main(int argc, char **argv, char **envp)
>  
>      qemu_init_auxval(envp);
>      qemu_cache_utils_init();
> +    XBZRLE_init();
>  
>      QLIST_INIT (&vm_change_state_head);
>      os_setup_early_signal_handling();
> -- 
> 1.8.1.4

I noticed that in main.c there is a call to a blk_mig_init() - so
how about (not even build tested):

diff --git a/vl.c b/vl.c
index 842e897..ab8cbf0 100644
--- a/vl.c
+++ b/vl.c
@@ -4247,6 +4247,7 @@ int main(int argc, char **argv, char **envp)
     cpu_exec_init_all();
 
     blk_mig_init();
+    ram_mig_init();
 
     /* open the virtual block devices */
     if (snapshot)
@@ -4261,8 +4262,6 @@ int main(int argc, char **argv, char **envp)
     default_drive(default_floppy, snapshot, IF_FLOPPY, 0, FD_OPTS);
     default_drive(default_sdcard, snapshot, IF_SD, 0, SD_OPTS);
 
-    register_savevm_live(NULL, "ram", 0, 4, &savevm_ram_handlers, NULL);
-
     if (nb_numa_nodes > 0) {
         int i;
diff --git a/arch_init.c b/arch_init.c
index 16474b5..4f8376d 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -190,6 +190,12 @@ static void XBZRLE_cache_unlock(void)
         qemu_mutex_unlock(&XBZRLE.lock);
 }
 
+void ram_mig_init(void)
+{
+    qemu_mutex_init(&XBZRLE.lock);
+    register_savevm_live(NULL, "ram", 0, 4, &savevm_ram_handlers, NULL);
+}
+
 /* called from qmp_migrate_set_cache_size in main thread, possibly while
  * a migration is in progress.
  * A running migration maybe using the cache and might finish during this
@@ -1117,7 +1123,7 @@ done:
     return ret;
 }
 
-SaveVMHandlers savevm_ram_handlers = {
+static SaveVMHandlers savevm_ram_handlers = {
     .save_live_setup = ram_save_setup,
     .save_live_iterate = ram_save_iterate,
     .save_live_complete = ram_save_complete,

diff --git a/include/migration/migration.h b/include/migration/migration.h
index 3e1e6c7..31fbf17 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -113,8 +113,6 @@ void free_xbzrle_decoded_buf(void);                                                     
 
 void acct_update_position(QEMUFile *f, size_t size, bool zero);
 
-extern SaveVMHandlers savevm_ram_handlers;
-
 uint64_t dup_mig_bytes_transferred(void);
 uint64_t dup_mig_pages_transferred(void);
 uint64_t skipped_mig_bytes_transferred(void);

and that way we're simplifying vl.c rather than teaching it even
more about the innards of ram migration.

Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

  reply	other threads:[~2014-03-19  9:32 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-03-18 15:56 [Qemu-devel] [PATCH 1/1] Stop reinit of XBZRLE.lock Dr. David Alan Gilbert (git)
2014-03-18 16:47 ` 陈梁
2014-03-18 20:20   ` Dr. David Alan Gilbert
2014-03-18 17:24 ` Markus Armbruster
2014-03-18 20:47   ` Dr. David Alan Gilbert
2014-03-19  7:50     ` Markus Armbruster
2014-03-19  9:31       ` Dr. David Alan Gilbert [this message]
2014-03-19 12:07         ` Markus Armbruster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140319093139.GA2371@work-vm \
    --to=dgilbert@redhat.com \
    --cc=arei.gonglei@huawei.com \
    --cc=armbru@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.