public inbox for kvm@vger.kernel.org
* Status of Fault Tolerance feature?
@ 2013-01-21 12:24 Andres Toomsalu
  2013-01-28 22:46 ` Brian Jackson
  0 siblings, 1 reply; 5+ messages in thread
From: Andres Toomsalu @ 2013-01-21 12:24 UTC
  To: kvm

Hi,

Could anyone shed some light on what happened to the Kemari project, and is any development planned to provide continuous non-blocking VM checkpointing and VM HA with state replication?

Kind regards,
-- 
----------------------------------------------
Andres Toomsalu, andres@opennodecloud.com
http://www.opennodecloud.com



* Re: Status of Fault Tolerance feature?
  2013-01-21 12:24 Status of Fault Tolerance feature? Andres Toomsalu
@ 2013-01-28 22:46 ` Brian Jackson
  2013-01-29 13:18   ` Andres Toomsalu
       [not found]   ` <0AA77D40-C291-47B0-9F42-04F65C196A25@opennodecloud.com>
  0 siblings, 2 replies; 5+ messages in thread
From: Brian Jackson @ 2013-01-28 22:46 UTC
  To: Andres Toomsalu; +Cc: kvm

On Mon, 21 Jan 2013 14:24:12 +0200
Andres Toomsalu <andres@opennodecloud.com> wrote:

> Hi,
> 
> Could anyone shed some light on what happened to the Kemari project,
> and is any development planned to provide continuous non-blocking VM
> checkpointing and VM HA with state replication?
> 
> Kind regards,

The project hasn't been actively developed in years and there has been
no public information about it in probably longer. So state is
"unknown".


* Re: Status of Fault Tolerance feature?
  2013-01-28 22:46 ` Brian Jackson
@ 2013-01-29 13:18   ` Andres Toomsalu
       [not found]   ` <0AA77D40-C291-47B0-9F42-04F65C196A25@opennodecloud.com>
  1 sibling, 0 replies; 5+ messages in thread
From: Andres Toomsalu @ 2013-01-29 13:18 UTC
  To: kvm

But are there any other projects in (planned) development with the same goal(s)?
I'm just really puzzled that, while QEMU/KVM is already a fairly mature solution, no true fault tolerance/HA solutions exist (I'm aware of stateless HA solutions with RHCS and similar stacks, but that is hardly "true" HA), and, if I understand correctly, there are no real plans or development in that direction in the near term either?

Kind regards,
-- 
----------------------------------------------
Andres Toomsalu, andres@opennodecloud.com
http://www.opennodecloud.com

On 29.01.2013, at 0:46, Brian Jackson wrote:

> On Mon, 21 Jan 2013 14:24:12 +0200
> Andres Toomsalu <andres@opennodecloud.com> wrote:
> 
>> Hi,
>> 
>> Could anyone shed some light on what happened to the Kemari project,
>> and is any development planned to provide continuous non-blocking VM
>> checkpointing and VM HA with state replication?
>> 
>> Kind regards,
> 
> The project hasn't been actively developed in years and there has been
> no public information about it in probably longer. So state is
> "unknown".



* Re: Status of Fault Tolerance feature?
       [not found]   ` <0AA77D40-C291-47B0-9F42-04F65C196A25@opennodecloud.com>
@ 2013-01-29 15:20     ` Brian Jackson
  2013-01-29 15:48       ` Andres Toomsalu
  0 siblings, 1 reply; 5+ messages in thread
From: Brian Jackson @ 2013-01-29 15:20 UTC
  To: Andres Toomsalu; +Cc: kvm

On Tue, 29 Jan 2013 15:16:13 +0200
Andres Toomsalu <andres@opennodecloud.com> wrote:

> But are there any other projects in (planned) development with the
> same goal(s)?


I haven't heard of any. But then again, a lot of things get developed
in secret and then dumped on the community.


> I'm just really puzzled that, while QEMU/KVM is already a fairly
> mature solution, no true fault tolerance/HA solutions exist (I'm
> aware of stateless HA solutions with RHCS and similar stacks, but
> that is hardly "true" HA), and, if I understand correctly, there are
> no real plans or development in that direction in the near term?


Most people that I know that have tried similar solutions on other
products give up on it because the performance is abysmal. It's
generally faster and better tested to do this stuff at the application
layer.


> 
> Kind regards,



* Re: Status of Fault Tolerance feature?
  2013-01-29 15:20     ` Brian Jackson
@ 2013-01-29 15:48       ` Andres Toomsalu
  0 siblings, 0 replies; 5+ messages in thread
From: Andres Toomsalu @ 2013-01-29 15:48 UTC
  To: Brian Jackson; +Cc: kvm


On 29.01.2013, at 17:20, Brian Jackson wrote:

> On Tue, 29 Jan 2013 15:16:13 +0200
> Andres Toomsalu <andres@opennodecloud.com> wrote:
> 
>> But are there any other projects in (planned) development with the
>> same goal(s)?
> 
> 
> I haven't heard of any. But then again, a lot of things get developed
> in secret and then dumped on the community.

Sure.

> 
> 
>> I'm just really puzzled that, while QEMU/KVM is already a fairly
>> mature solution, no true fault tolerance/HA solutions exist (I'm
>> aware of stateless HA solutions with RHCS and similar stacks, but
>> that is hardly "true" HA), and, if I understand correctly, there are
>> no real plans or development in that direction in the near term?
> 
> 
> Most people that I know that have tried similar solutions on other
> products give up on it because the performance is abysmal. It's
> generally faster and better tested to do this stuff at the application
> layer.

I've been looking into (somewhat) hypervisor-agnostic solutions, e.g. general Linux checkpoint/restore solutions like DMTCP (dmtcp.sf.net) and CRIU (criu.org).
There is actually a proof-of-concept DMTCP implementation for KVM, described in this paper: http://arxiv.org/pdf/1212.1787v1.pdf
CRIU currently supports only Linux containers (OpenVZ, LXC), but it would probably be possible to add support for QEMU/KVM as well, using an approach similar to that of the DMTCP KVM plugin.

If I understand correctly, the DMTCP approach is a kind of non-blocking solution with a possibly acceptable performance tradeoff.
I would see great benefit in having a kind of "generic" checkpoint/restore mechanism with support for multiple hypervisors, which could also be the basis for future fault tolerance/HA solutions.
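
To make the idea of a "generic" mechanism a bit more concrete, here is a purely illustrative Python sketch. It is not the API of DMTCP, CRIU, or QEMU/KVM; every name in it (Checkpoint, CheckpointBackend, InMemoryBackend, replicate) is hypothetical, and the toy backend just treats a byte buffer as the entire "VM state". It only shows the shape such an abstraction might take: one small per-hypervisor driver interface, with replication written once on top of it.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
import time


@dataclass
class Checkpoint:
    """Opaque snapshot of one VM (hypothetical format)."""
    vm_id: str
    state: bytes
    taken_at: float = field(default_factory=time.time)


class CheckpointBackend(ABC):
    """Hypothetical per-hypervisor driver; not a real DMTCP/CRIU/QEMU API."""

    @abstractmethod
    def save(self, vm_id: str) -> Checkpoint:
        ...

    @abstractmethod
    def restore(self, checkpoint: Checkpoint) -> None:
        ...


class InMemoryBackend(CheckpointBackend):
    """Toy backend: a bytearray per VM stands in for real guest state."""

    def __init__(self):
        self.vms = {}

    def save(self, vm_id):
        # Copy the state so later guest writes do not alter the snapshot.
        return Checkpoint(vm_id, bytes(self.vms[vm_id]))

    def restore(self, checkpoint):
        self.vms[checkpoint.vm_id] = bytearray(checkpoint.state)


def replicate(primary, standby, vm_id):
    """Take one checkpoint on the primary and apply it on the standby."""
    checkpoint = primary.save(vm_id)
    standby.restore(checkpoint)
    return checkpoint
```

A fault tolerance layer would call something like replicate() repeatedly; doing that without pausing the guest for long is the hard part that this sketch deliberately leaves out.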

> 
> 
>> 
>> Kind regards,
> 


