From: Ian Campbell <ian.campbell@citrix.com>
To: Wen Congyang <wency@cn.fujitsu.com>,
	xen devel <xen-devel@lists.xen.org>,
	Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Shriram Rajagopalan <rshriram@cs.ubc.ca>,
	Wei Liu <wei.liu2@citrix.com>,
	Changlong Xie <xiecl.fnst@cn.fujitsu.com>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	Yang Hongyang <hongyang.yang@easystack.cn>
Subject: Re: [PATCH v4 2/5] remus: resume immediately if libxl__xc_domain_save_done() completes
Date: Tue, 19 Jan 2016 11:01:25 +0000
Message-ID: <1453201285.29930.14.camel@citrix.com>
In-Reply-To: <569D8ACF.30508@cn.fujitsu.com>

On Tue, 2016-01-19 at 09:01 +0800, Wen Congyang wrote:
> On 01/19/2016 12:51 AM, Ian Campbell wrote:
> > On Mon, 2016-01-18 at 13:40 +0800, Wen Congyang wrote:
> > > For example: if the secondary host is down, and we fail to send the data to
> > > the secondary host. xc_domain_save() returns 0. So in the function
> > > libxl__xc_domain_save_done(), rc is 0(the helper program exits normally),
> > > and retval is 0(it is xc_domain_save()'s return value). In such case, we
> > > just need to complete the stream.
> > 
> > What if the secondary host isn't actually down but just communication has
> > failed for some reason? Won't both primary and secondary start their
> > respective versions of the domain? What are the consequences of that?
> > (Corruption?)
> > 
> > I suppose this is a consequence of the lack of STONITH or splitbrain
> > handling within Remus. Are there any plans to address this?
> 
> IIRC, Shriram Rajagopalan has some ideas about it(check the external heartbeat?).
> There is no way to avoid splitbrain unless we have more than two hosts(at least
> three hosts). If we want to avoid splitbrain, we may need to destroy both primary
> and secondary guests.

I think there are plenty of existing systems for taking care of this side of
fault-tolerance/HA (e.g. linux-ha, Pacemaker, Corosync); we don't need
(or want) to reinvent that particular wheel here.
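
To illustrate why the "at least three hosts" point above matters, here is a
minimal sketch of the strict-majority quorum rule that such HA systems build
on. The function name and its parameters are hypothetical; this is not libxl,
Pacemaker or Corosync API, just the arithmetic under the assumption that each
host counts itself plus the peers it can still reach.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not part of libxl or any HA stack).
 *
 * reachable_hosts: hosts this node can currently see, itself included.
 * total_hosts:     configured size of the cluster.
 *
 * Returns true only when this node sees a strict majority of the cluster,
 * which is the condition under which it may safely run the guest.
 */
bool has_quorum(unsigned int reachable_hosts, unsigned int total_hosts)
{
    /* Precondition: a non-empty cluster, and no more reachable than exist. */
    assert(total_hosts > 0 && reachable_hosts <= total_hosts);

    /* Strict majority: more than half of the configured hosts. */
    return 2 * reachable_hosts > total_hosts;
}
```

With two hosts, a communication failure leaves each side seeing 1 of 2, so
neither has quorum and both would have to stop the guest. With three hosts,
the side that still sees 2 of 3 may continue and the isolated host must not.
Real cluster stacks layer membership tracking and fencing on top of this.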

I think we just need a story on how one would integrate with such a system
in order to say that Remus is properly usable in real world scenarios (i.e.
before we can remove the "proof-of-concept" wording from the man page).

That might just be a documentation exercise, or it might require adding some
hooks etc. to (lib)xl in order to allow such integrations; I'm not sure
what's needed.

IIRC Ian expressed a similar sentiment when Remus support was first added
to libxl.

Ian.

