From: Hongyang Yang <yanghy@cn.fujitsu.com>
To: Paolo Bonzini <pbonzini@redhat.com>, qemu-devel@nongnu.org
Cc: kwolf@redhat.com, Lai Jiangshan <laijs@cn.fujitsu.com>,
quintela@redhat.com, GuiJianfeng@cn.fujitsu.com,
yunhong.jiang@intel.com, eddie.dong@intel.com,
dgilbert@redhat.com, mrhines@linux.vnet.ibm.com,
stefanha@redhat.com, Amit Shah <amit.shah@redhat.com>,
walid.nouri@gmail.com
Subject: Re: [Qemu-devel] [PATCH RESEND 0/2] PoC: Block replication for continuous checkpointing
Date: Tue, 30 Dec 2014 15:52:38 +0800
Message-ID: <54A259C6.2090205@cn.fujitsu.com>
In-Reply-To: <549ECEF2.7080302@redhat.com>
Hi Paolo,
Thank you very much for the reply. You have understood exactly what we are
trying to do. We are investigating Quorum and NBD and will reply in a few
days.
Thanks again,
Yang.
On 12/27/2014 11:23 PM, Paolo Bonzini wrote:
>
>
> On 26/12/2014 04:31, Yang Hongyang wrote:
>> Please feel free to comment.
>> We would like as many comments and as much feedback as possible; thanks in advance.
>
> Hi Yang,
>
> I think it's possible to build COLO block replication from many basic
> blocks that are already in QEMU. The only new piece would be the disk
> buffer on the secondary.
>
>          virtio-blk       ||
>              ^            ||                            .----------
>              |            ||                            | Secondary
>         1 Quorum          ||                            '----------
>          /      \         ||
>         /        \        ||
>    Primary      2 NBD  ------->  2 NBD
>      disk       client    ||     server                  virtio-blk
>                           ||        ^                         ^
> --------.                 ||        |                         |
> Primary |                 ||  Secondary disk <--------- COLO buffer 3
> --------'                 ||                   backing
>
>
> 1) The disk on the primary is represented by a block device with two
> children, providing replication between a primary disk and the host that
> runs the secondary VM. The read pattern patches for quorum
> (http://lists.gnu.org/archive/html/qemu-devel/2014-08/msg02381.html) can
> be used/extended to make the primary always read from the local disk
> instead of going through NBD.
>
> 2) The secondary disk receives writes from the primary VM through QEMU's
> embedded NBD server (speculative write-through).
>
> 3) The disk on the secondary is represented by a custom block device
> ("COLO buffer"). The disk buffer's backing image is the secondary disk,
> and the disk buffer uses bdrv_add_before_write_notifier to implement
> copy-on-write, similar to block/backup.c.
>
> 4) Checkpointing can use new bdrv_prepare_checkpoint and
> bdrv_do_checkpoint members in BlockDriver to discard the COLO buffer,
> similar to your patches (you did not explain why you do checkpointing in
> two steps). Failover instead is done with bdrv_commit or can even be
> done without stopping the secondary (live commit, block/commit.c).
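
To make points 3) and 4) above concrete, here is a toy Python model of the
secondary side. It is only a sketch under stated assumptions, not QEMU code:
the class and method names (ColoBufferModel, primary_write, and so on) are
hypothetical, sectors are plain list slots, and the "before-write" step
merely imitates what a notifier registered with
bdrv_add_before_write_notifier would do.

```python
class ColoBufferModel:
    """Toy model: a secondary disk (backing) with a COLO buffer on top."""

    def __init__(self, num_sectors):
        # "Secondary disk": receives the primary's writes (via NBD in the design).
        self.backing = [0] * num_sectors
        # "COLO buffer": sector -> data as seen by the secondary VM.
        self.buffer = {}

    def primary_write(self, sector, data):
        # Before-write step (copy-on-write): save the old contents once, so
        # the secondary VM's view does not change under it.
        if sector not in self.buffer:
            self.buffer[sector] = self.backing[sector]
        self.backing[sector] = data

    def secondary_write(self, sector, data):
        # Writes from the secondary VM stay in the buffer only.
        self.buffer[sector] = data

    def secondary_read(self, sector):
        # Buffer first, then fall through to the backing image.
        return self.buffer.get(sector, self.backing[sector])

    def checkpoint(self):
        # Discard the buffer: the secondary now sees the primary's state.
        self.buffer.clear()

    def failover(self):
        # Commit the buffer into the backing image (analogous to bdrv_commit).
        for sector, data in self.buffer.items():
            self.backing[sector] = data
        self.buffer.clear()
```

In this model a checkpoint throws away everything the secondary VM wrote
since the last checkpoint, while failover does the opposite and makes the
secondary VM's view the permanent disk contents.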
>
>
> The missing parts are:
>
> 1) NBD server on the backing image of the COLO buffer. This means the
> backing image needs its own BlockBackend. Apart from this, no new
> infrastructure is needed to receive writes on the secondary.
>
> 2) Read pattern support for quorum must be extended for the needs of
> the COLO primary. It may be simpler or faster to write a simple
> "replication" driver that writes to N children but always reads from the
> first. But in any case initial tests can be done with the quorum
> driver, even without read pattern support. Again, all the network
> infrastructure to replicate writes already exists in QEMU.
>
> 3) Of course the disk buffer itself.
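
The simple "replication" driver suggested in missing part 2) can be pictured
with a few lines of Python. This is a minimal sketch, assuming each child is
just a list of sectors; ReplicationModel and its methods are hypothetical
names and do not correspond to any existing QEMU driver.

```python
class ReplicationModel:
    """Toy model: write to all N children, always read from the first."""

    def __init__(self, children):
        if not children:
            raise ValueError("at least one child is required")
        # children[0] stands for the local primary disk; the others stand
        # for remote targets (for example an NBD client).
        self.children = children

    def write(self, sector, data):
        # Every write is replicated to every child.
        for child in self.children:
            child[sector] = data

    def read(self, sector):
        # Reads never touch the remote children.
        return self.children[0][sector]


local_disk = [0] * 4
remote_disk = [0] * 4
drive = ReplicationModel([local_disk, remote_disk])
drive.write(1, "x")
```

Unlike quorum, this model never compares or votes on the children's
contents, which is what makes it simpler and possibly faster for the COLO
primary.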
>
> Paolo
>
>> Thanks,
>> Yang.
>>
>> Wen Congyang (1):
>>   PoC: Block replication for COLO
>>
>> Yang Hongyang (1):
>>   Block: Block replication design for COLO
>>
>>  block.c                   |  48 +++++++
>>  block/blkcolo.c           | 338 ++++++++++++++++++++++++++++++++++++++++++++++
>>  docs/blkcolo.txt          |  85 ++++++++++++
>>  include/block/block.h     |   6 +
>>  include/block/block_int.h |  21 +++
>>  5 files changed, 498 insertions(+)
>>  create mode 100644 block/blkcolo.c
>>  create mode 100644 docs/blkcolo.txt
>>
> .
>
--
Thanks,
Yang.