qemu-devel.nongnu.org archive mirror
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: zhanghailiang <zhang.zhanghailiang@huawei.com>
Cc: yunhong.jiang@intel.com, eddie.dong@intel.com,
	qemu-devel@nongnu.org, dgilbert@redhat.com,
	Gao feng <gaofeng@cn.fujitsu.com>,
	stefanha@redhat.com, pbonzini@redhat.com,
	peter.huangpeng@huawei.com
Subject: Re: [Qemu-devel] [PATCH RFC v3 24/27] COLO NIC: Implement NIC checkpoint and failover
Date: Thu, 5 Mar 2015 17:12:02 +0000
Message-ID: <20150305171201.GF2381@work-vm>
In-Reply-To: <1423711034-5340-25-git-send-email-zhang.zhanghailiang@huawei.com>

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
> ---
>  include/net/colo-nic.h |  3 ++-
>  migration/colo.c       | 22 ++++++++++++++++++----
>  net/colo-nic.c         | 19 +++++++++++++++++++
>  3 files changed, 39 insertions(+), 5 deletions(-)
> 
> diff --git a/include/net/colo-nic.h b/include/net/colo-nic.h
> index 67c9807..ddc21cd 100644
> --- a/include/net/colo-nic.h
> +++ b/include/net/colo-nic.h
> @@ -20,5 +20,6 @@ void colo_add_nic_devices(NetClientState *nc);
>  void colo_remove_nic_devices(NetClientState *nc);
>  
>  int colo_proxy_compare(void);
> -
> +int colo_proxy_failover(void);
> +int colo_proxy_checkpoint(void);
>  #endif
> diff --git a/migration/colo.c b/migration/colo.c
> index 579aabf..874971c 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -94,6 +94,11 @@ static void slave_do_failover(void)
>          ;
>      }
>  
> +    if (colo_proxy_failover() != 0) {
> +        error_report("colo proxy failed to do failover");
> +    }
> +    colo_proxy_destroy(COLO_SECONDARY_MODE);

I'm not sure if this is the best thing to do on a secondary failover.
If I understand correctly, when it's running, we have:


-------+
       |                    br0---eth0
       |
 slave +-tun - xt_SECCOLO - br1---eth1
       |
-------+

What I think colo_proxy_destroy is doing is rewiring that as:


-------+
       |     +--------------br0---eth0
       |     |
 slave +-tun +              br1---eth1
       |
-------+

But now we've lost the sequence number adjustment data that
was held in xt_SECCOLO, so you are likely to break existing TCP
connections.
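
To make the sequence-number concern concrete, here is a minimal sketch in C.
It assumes the adjustment data is a per-connection offset between the
primary's and the secondary's initial sequence numbers; the struct and
function names are hypothetical and are not taken from xt_SECCOLO.

```c
#include <stdint.h>

/*
 * Hypothetical per-connection state. The real xt_SECCOLO layout is not
 * shown in this thread; this struct is an assumption for illustration.
 */
struct seq_adjust {
    uint32_t offset; /* primary ISN minus secondary ISN, modulo 2^32 */
};

/* Record the difference between the two guests' initial sequence numbers. */
static struct seq_adjust seq_adjust_init(uint32_t primary_isn,
                                         uint32_t secondary_isn)
{
    struct seq_adjust adj = {
        .offset = (uint32_t)(primary_isn - secondary_isn),
    };
    return adj;
}

/* Outgoing: map the secondary's sequence number into the numbering the
 * client has already seen from the primary. */
static uint32_t adjust_out_seq(const struct seq_adjust *adj, uint32_t seq)
{
    return (uint32_t)(seq + adj->offset);
}

/* Incoming: map the client's ACK number back into the secondary's own
 * numbering. */
static uint32_t adjust_in_ack(const struct seq_adjust *adj, uint32_t ack)
{
    return (uint32_t)(ack - adj->offset);
}
```

With a primary ISN of 1000 and a secondary ISN of 4000000000, a secondary
sequence number of 4000000100 goes out as 1100. If the offset is thrown away
at failover (effectively reset to zero), the same segment goes out as
4000000100, which the client would most likely treat as far outside its
receive window.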

Also, I don't think colo-proxy-script is passed a flag to let it
know whether the reason it's doing a slave_uninstall is a failover
or a simple shutdown, so it assumes it has to do the rewire for a
failover.
(Actually, the script in the qemu repo is newer than the script in
the colo-proxy repo; that one doesn't have the rewire at all.)
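
One way to let the script tell the two cases apart is sketched below in C.
Everything here is an assumption for illustration: the enum, the helper
names, and the extra trailing argument are hypothetical and do not describe
the script's current interface.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical: why the secondary-side proxy is being torn down. */
enum teardown_reason {
    TEARDOWN_SHUTDOWN,
    TEARDOWN_FAILOVER,
};

static const char *teardown_reason_str(enum teardown_reason reason)
{
    return reason == TEARDOWN_FAILOVER ? "failover" : "shutdown";
}

/* Only a failover would ever want the tap device rewired onto the
 * live bridge; a plain shutdown has no guest left to reconnect. */
static bool needs_rewire(enum teardown_reason reason)
{
    return reason == TEARDOWN_FAILOVER;
}

/*
 * Build "<script> slave_uninstall <reason>" into buf.
 * Returns 0 on success, -1 on error or if the result was truncated.
 * The argument layout is assumed, not the script's real interface.
 */
static int build_uninstall_cmd(char *buf, size_t size, const char *script,
                               enum teardown_reason reason)
{
    int n = snprintf(buf, size, "%s slave_uninstall %s", script,
                     teardown_reason_str(reason));
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The script could then branch on the final argument and skip the rewire
on a plain shutdown, rather than assuming every uninstall is a failover.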

Dave

> +
>      colo = NULL;
>  
>      if (!autostart) {
> @@ -115,7 +120,7 @@ static void master_do_failover(void)
>      if (!colo_runstate_is_stopped()) {
>          vm_stop_force_state(RUN_STATE_COLO);
>      }
> -
> +    colo_proxy_destroy(COLO_PRIMARY_MODE);
>      if (s->state != MIG_STATE_ERROR) {
>          migrate_set_state(s, MIG_STATE_COLO, MIG_STATE_COMPLETED);
>      }
> @@ -245,6 +250,11 @@ static int do_colo_transaction(MigrationState *s, QEMUFile *control)
>  
>      qemu_fflush(trans);
>  
> +    ret = colo_proxy_checkpoint();
> +    if (ret < 0) {
> +        goto out;
> +    }
> +
>      ret = colo_ctl_put(s->file, COLO_CHECKPOINT_SEND);
>      if (ret < 0) {
>          goto out;
> @@ -387,8 +397,6 @@ out:
>      qemu_bh_schedule(s->cleanup_bh);
>      qemu_mutex_unlock_iothread();
>  
> -    colo_proxy_destroy(COLO_PRIMARY_MODE);
> -
>      return NULL;
>  }
>  
> @@ -508,6 +516,12 @@ void *colo_process_incoming_checkpoints(void *opaque)
>              goto out;
>          }
>  
> +        ret = colo_proxy_checkpoint();
> +        if (ret < 0) {
> +            goto out;
> +        }
> +        DPRINTF("proxy begin to do checkpoint\n");
> +
>          ret = colo_ctl_get(f, COLO_CHECKPOINT_SEND);
>          if (ret < 0) {
>              goto out;
> @@ -584,6 +598,7 @@ out:
>          * just kill slave
>          */
>          error_report("SVM is going to exit!");
> +        colo_proxy_destroy(COLO_SECONDARY_MODE);
>          exit(1);
>      } else {
>          /* if we went here, means master may dead, we are doing failover */
> @@ -610,6 +625,5 @@ out:
>  
>      loadvm_exit_colo();
>  
> -    colo_proxy_destroy(COLO_SECONDARY_MODE);
>      return NULL;
>  }
> diff --git a/net/colo-nic.c b/net/colo-nic.c
> index 563d661..02a454d 100644
> --- a/net/colo-nic.c
> +++ b/net/colo-nic.c
> @@ -379,6 +379,25 @@ void colo_proxy_destroy(int side)
>      cp_info.index = -1;
>      colo_nic_side = -1;
>  }
> +
> +int colo_proxy_failover(void)
> +{
> +    if (colo_proxy_send(NULL, 0, COLO_FAILOVER) < 0) {
> +        return -1;
> +    }
> +
> +    return 0;
> +}
> +
> +int colo_proxy_checkpoint(void)
> +{
> +    if (colo_proxy_send(NULL, 0, COLO_CHECKPOINT) < 0) {
> +        return -1;
> +    }
> +
> +    return 0;
> +}
> +
>  /*
>  do checkpoint: return 1
>  error: return -1
> -- 
> 1.7.12.4
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
