From mboxrd@z Thu Jan  1 00:00:00 1970
From: Harry Butterworth <harry@hebutterworth.freeserve.co.uk>
Subject: Re: Is continuous replication of state possible?
Date: Sun, 09 Jan 2005 02:00:45 +0000
Message-ID: <1105236045.3134.66.camel@localhost>
References: <aa7b01ec05010623375fadc451@mail.gmail.com>
	 <20050107100119.GW8251@cl.cam.ac.uk>
	 <aa7b01ec05010816423567f2cd@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-admin@lists.sourceforge.net>
In-Reply-To: <aa7b01ec05010816423567f2cd@mail.gmail.com>
Sender: xen-devel-admin@lists.sourceforge.net
Errors-To: xen-devel-admin@lists.sourceforge.net
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.sourceforge.net?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.sourceforge.net>
List-Help: <mailto:xen-devel-request@lists.sourceforge.net?subject=help>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.sourceforge.net?subject=subscribe>
List-Archive: <http://sourceforge.net/mailarchive/forum.php?forum=xen-devel>
To: Ian.Pratt@cl.cam.ac.uk
Cc: xen-devel@lists.sourceforge.net
List-Id: xen-devel@lists.xenproject.org

Ian Pratt wrote:
> Software-implemented hardware fault-tolerance is on the Xen
> research roadmap.
>
> It basically just requires deterministic execution and event
> injection. Doing this for uniprocessor guests is fairly straight
> forward. Doing it for SMP guests (with decent performance) is
> going to be a huge challenge, as determinism is hard to achieve. We're
> looking in to it...

I thought a little about trying to implement a byzantine fault tolerant
SMP virtual machine using state replication.  I read Barbara Liskov's
group's work for some BFT distributed consensus protocols and there is
some good work out there on BFT routing in ad-hoc networks and a little
bit on BFT authentication but I didn't find much on the SMP issues.

I wanted to try out two approaches:

1) Putting each CPU into a separate BFT context and suffering the
consensus protocol overhead in inter-cpu communications.
I thought this would probably work OK for replicated servers which were
relatively near each other (for example on IBM's proposed CIB hardware)
but I was also interested in global replication for continuity through
disasters so I was considering
2) Assuming a standard execution rate for instructions and speculatively
executing them non-deterministically in parallel. Non-determinism is
only significant when communication between CPUs happens in the wrong
order (for example when an instruction speculatively assigned to a
particular time-step on one cpu writes to a cache line which has already
been read from by an instruction speculatively assigned to a later
time-step by another cpu) this must be detected at which point the local
replica has to roll-back and then roll forwards more carefully to get
the serialisation right. I think this would work better when the
replicas are separated geographically because there is no BFT protocol
overhead for inter-cpu communication.

In both cases, I was going to emulate CPUs rather than try to use the
real CPU or any hardware features.  With emulation, I wanted to try out
using a high emulated:real CPU ratio and/or deep pipelines to ameliorate
the inherent stall problem of the first approach.  Emulation seems to be
a requirement for investigating the second approach.

I think there's also a possibility of using dynamic translation to make
either of these approaches go faster. With dynamic translation, either
approach might get to within a small factor of the theoretical limit for
BFT.

Not good for scientific performance computing but the BFT, global
replication, massive parallelism and single system image features would
be great for a lot of business applications.

I'm very curious, do you have another approach? Any thoughts about
real-time?

Harry.


-- 
Harry Butterworth <harry@hebutterworth.freeserve.co.uk>


-------------------------------------------------------
The SF.Net email is sponsored by: Beat the post-holiday blues
Get a FREE limited edition SourceForge.net t-shirt from ThinkGeek.
It's fun and FREE -- well, almost....http://www.thinkgeek.com/sfshirt