qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@gmail.com>
To: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Cc: qemu devel <qemu-devel@nongnu.org>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Fam Zheng <famz@redhat.com>, Max Reitz <mreitz@redhat.com>,
	Kevin Wolf <kwolf@redhat.com>, Jeff Cody <jcody@redhat.com>,
	Wen Congyang <wency@cn.fujitsu.com>,
	zhanghailiang <zhang.zhanghailiang@huawei.com>,
	qemu block <qemu-block@nongnu.org>,
	Jiang Yunhong <yunhong.jiang@intel.com>,
	Dong Eddie <eddie.dong@intel.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	Markus Armbruster <armbru@redhat.com>,
	Gonglei <arei.gonglei@huawei.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH v19 00/10] Block replication for continuous checkpoints
Date: Mon, 30 May 2016 11:20:16 -0700	[thread overview]
Message-ID: <20160530182016.GC1366@stefanha-x1.localdomain> (raw)
In-Reply-To: <1463729780-31982-1-git-send-email-xiecl.fnst@cn.fujitsu.com>

[-- Attachment #1: Type: text/plain, Size: 6772 bytes --]

On Fri, May 20, 2016 at 03:36:10PM +0800, Changlong Xie wrote:
> Block replication is a very important feature which is used for
> continuous checkpoints(for example: COLO).
> 
> You can get the detailed information about block replication from here:
> http://wiki.qemu.org/Features/BlockReplication
> 
> Usage:
> Please refer to docs/block-replication.txt
> 
> You can get the patch here:
> https://github.com/Pating/qemu/tree/changlox/block-replication-v19
> 
> You can get the patch with framework here:
> https://github.com/Pating/qemu/tree/changlox/colo_framework_v18
> 
> TODO:
> 1. Continuous block replication. It will be started after basic functions
>    are accepted.
> 
> Changs Log:
> V19:
> 1. Rebase to v2.6.0
> 2. Address comments from stefan
> p3: a new patch that export interfaces for extra serialization
> p8: 
> 1. call replication_stop() before freeing s->top_id
> 2. check top_bs
> 3. reopen file readonly in error return paths
> 4. enable extra serialization between read and COW
> p9: try to hanlde SIGABRT
> V18:
> p6: add local_err in all replication callbacks to prevent "errp == NULL"
> p7: add missing qemu_iovec_destroy(xxx)
> V17:
> 1. Rebase to the lastest codes 
> p2: refactor backup_do_checkpoint addressed comments from Jeff Cody
> p4: fix bugs in "drive_add buddy xxx" hmp commands
> p6: add "since: 2.7"
> p7: fix bug in replication_close(), add missing "qapi/error.h", add test-replication 
> p8: add "since: 2.7"
> V16:
> 1. Rebase to the newest codes
> 2. Address comments from Stefan & hailiang
> p3: we don't need this patch now
> p4: add "top-id" parameters for secondary
> p6: fix NULL pointer in replication callbacks, remove unnecessary typedefs, 
> add doc comments that explain the semantics of Replication
> p7: Refactor AioContext for thread-safe, remove unnecessary get_top_bs()
> *Note*: I'm working on replication testcase now, will send out in V17
> V15:
> 1. Rebase to the newest codes
> 2. Fix typos and coding style addresed Eric's comments
> 3. Address Stefan's comments
>    1) Make backup_do_checkpoint public, drop the changes on BlockJobDriver
>    2) Update the message and description for [PATCH 4/9]
>    3) Make replication_(start/stop/do_checkpoint)_all as global interfaces
>    4) Introduce AioContext lock to protect start/stop/do_checkpoint callbacks
>    5) Use BdrvChild instead of holding on to BlockDriverState * pointers
> 4. Clear BDRV_O_INACTIVE for hidden disk's open_flags since commit 09e0c771  
> 5. Introduce replication_get_error_all to check replication status
> 6. Remove useless discard interface
> V14:
> 1. Implement auto complete active commit
> 2. Implement active commit block job for replication.c
> 3. Address the comments from Stefan, add replication-specific API and data
>    structure, also remove old block layer APIs
> V13:
> 1. Rebase to the newest codes
> 2. Remove redundant marcos and semicolon in replication.c 
> 3. Fix typos in block-replication.txt
> V12:
> 1. Rebase to the newest codes
> 2. Use backing reference to replcace 'allow-write-backing-file'
> V11:
> 1. Reopen the backing file when starting blcok replication if it is not
>    opened in R/W mode
> 2. Unblock BLOCK_OP_TYPE_BACKUP_SOURCE and BLOCK_OP_TYPE_BACKUP_TARGET
>    when opening backing file
> 3. Block the top BDS so there is only one block job for the top BDS and
>    its backing chain.
> V10:
> 1. Use blockdev-remove-medium and blockdev-insert-medium to replace backing
>    reference.
> 2. Address the comments from Eric Blake
> V9:
> 1. Update the error messages
> 2. Rebase to the newest qemu
> 3. Split child add/delete support. These patches are sent in another patchset.
> V8:
> 1. Address Alberto Garcia's comments
> V7:
> 1. Implement adding/removing quorum child. Remove the option non-connect.
> 2. Simplify the backing refrence option according to Stefan Hajnoczi's suggestion
> V6:
> 1. Rebase to the newest qemu.
> V5:
> 1. Address the comments from Gong Lei
> 2. Speed the failover up. The secondary vm can take over very quickly even
>    if there are too many I/O requests.
> V4:
> 1. Introduce a new driver replication to avoid touch nbd and qcow2.
> V3:
> 1: use error_setg() instead of error_set()
> 2. Add a new block job API
> 3. Active disk, hidden disk and nbd target uses the same AioContext
> 4. Add a testcase to test new hbitmap API
> V2:
> 1. Redesign the secondary qemu(use image-fleecing)
> 2. Use Error objects to return error message
> 3. Address the comments from Max Reitz and Eric Blake
> 
> Changlong Xie (3):
>   Backup: export interfaces for extra serialization
>   Introduce new APIs to do replication operation
>   tests: add unit test case for replication
> 
> Wen Congyang (7):
>   unblock backup operations in backing file
>   Backup: clear all bitmap when doing block checkpoint
>   Link backup into block core
>   docs: block replication's description
>   auto complete active commit
>   Implement new driver for block replication
>   support replication driver in blockdev-add
> 
>  Makefile.objs                |   1 +
>  block.c                      |  17 ++
>  block/Makefile.objs          |   3 +-
>  block/backup.c               |  59 +++-
>  block/mirror.c               |  13 +-
>  block/replication.c          | 666 +++++++++++++++++++++++++++++++++++++++++++
>  blockdev.c                   |   2 +-
>  docs/block-replication.txt   | 239 ++++++++++++++++
>  include/block/block_backup.h |  17 ++
>  include/block/block_int.h    |   3 +-
>  qapi/block-core.json         |  33 ++-
>  qemu-img.c                   |   2 +-
>  replication.c                | 105 +++++++
>  replication.h                | 176 ++++++++++++
>  tests/.gitignore             |   1 +
>  tests/Makefile               |   4 +
>  tests/test-replication.c     | 523 +++++++++++++++++++++++++++++++++
>  17 files changed, 1847 insertions(+), 17 deletions(-)
>  create mode 100644 block/replication.c
>  create mode 100644 docs/block-replication.txt
>  create mode 100644 include/block/block_backup.h
>  create mode 100644 replication.c
>  create mode 100644 replication.h
>  create mode 100644 tests/test-replication.c

I have reviewed many revisions of this series.  The main mechanism in
this series makes sense to me.

I'm still concerned that checkpointing (vm_stop(), not in this series
but COLO in general) depends on bdrv_drain(), which can block forever if
I/O is hung.  That doesn't seem like a reasonable limitation for a high
availability feature since it may lead to the VM becoming unavailable.

I'd like Jeff and/or Kevin to review this series and merge it once they
are happy.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

  parent reply	other threads:[~2016-05-30 18:20 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-20  7:36 [Qemu-devel] [PATCH v19 00/10] Block replication for continuous checkpoints Changlong Xie
2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 01/10] unblock backup operations in backing file Changlong Xie
2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 02/10] Backup: clear all bitmap when doing block checkpoint Changlong Xie
2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 03/10] Backup: export interfaces for extra serialization Changlong Xie
2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 04/10] Link backup into block core Changlong Xie
2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 05/10] docs: block replication's description Changlong Xie
2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 06/10] auto complete active commit Changlong Xie
2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 07/10] Introduce new APIs to do replication operation Changlong Xie
2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 08/10] Implement new driver for block replication Changlong Xie
2016-05-30 18:14   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2016-05-31  1:20     ` Changlong Xie
2016-06-07  4:59   ` [Qemu-devel] " Changlong Xie
2016-06-07  5:36   ` Changlong Xie
2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 09/10] tests: add unit test case for replication Changlong Xie
2016-05-27  1:46   ` Changlong Xie
2016-05-30 17:34   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2016-05-31 10:21     ` Changlong Xie
2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 10/10] support replication driver in blockdev-add Changlong Xie
2016-05-27  1:59 ` [Qemu-devel] [PATCH v19 00/10] Block replication for continuous checkpoints Changlong Xie
2016-05-27  7:23   ` Fam Zheng
2016-05-30 18:20 ` Stefan Hajnoczi [this message]
2016-05-31 10:25   ` [Qemu-devel] [Qemu-block] " Changlong Xie

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160530182016.GC1366@stefanha-x1.localdomain \
    --to=stefanha@gmail.com \
    --cc=arei.gonglei@huawei.com \
    --cc=armbru@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=eddie.dong@intel.com \
    --cc=famz@redhat.com \
    --cc=jcody@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=wency@cn.fujitsu.com \
    --cc=xiecl.fnst@cn.fujitsu.com \
    --cc=yunhong.jiang@intel.com \
    --cc=zhang.zhanghailiang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).