From: Ori Mamluk <omamluk@zerto.com>
To: qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>,
Roni Luxenberg <rluxenbe@redhat.com>,
Stefan Hajnoczi <stefanha@gmail.com>,
dlaor@redhat.com, Anthony Liguori <anthony@codemonkey.ws>,
Oded Kedem <oded@zerto.com>, Yair Kuszpet <yairk@zerto.com>,
Paolo Bonzini <pbonzini@redhat.com>
Subject: [Qemu-devel] [RFC PATCH v3 6/9] repagent: Updated documentation in qemu-repagent.txt. Renamed rephub read command to remoteIo.
Date: Thu, 5 Apr 2012 15:17:58 +0300
Message-ID: <24dda3a956e11856b0439825366f186e@mail.gmail.com>
Sent to 'qemu-devel@nongnu.org' as repagent patch V2
---
block/repagent/qemu-repagent.txt | 147 +++++++++++++++-----------------------
block/repagent/repagent.c | 71 ++++++++++--------
block/repagent/repagent.h | 4 +-
block/repagent/repagent_client.c | 4 +-
block/repagent/repcmd.h | 4 +-
block/repagent/rephub_cmds.h | 27 ++++---
6 files changed, 119 insertions(+), 138 deletions(-)
diff --git a/block/repagent/qemu-repagent.txt b/block/repagent/qemu-repagent.txt
index e3b0c1e..f8def3f 100644
--- a/block/repagent/qemu-repagent.txt
+++ b/block/repagent/qemu-repagent.txt
@@ -1,104 +1,73 @@
- repagent - replication agent - a Qemu module for enabling continuous async replication of VM volumes
+ repagent - replication agent - a Qemu module for enabling continuous async replication of VM volumes
Introduction
- This document describes a feature in Qemu - a replication agent (AKA Repagent).
- The Repagent is a new module that exposes an API to an external replication system (AKA Rephub).
- This API allows a Rephub to communicate with a Qemu VM and continuously replicate its volumes.
- The imlementation of a Rephub is outside of the scope of this document. There may be several various Rephub
- implenetations using the same repagent in Qemu.
+ This document describes a feature in Qemu - a replication agent (named Repagent).
+ The Repagent is a new module that exposes an API to an external replication system (AKA Rephub).
+ This API allows a Rephub to communicate with a Qemu VM and continuously replicate its volumes.
+ The implementation of a Rephub is outside the scope of this document. There may be several different Rephub
+ implementations using the same repagent in Qemu.
+ The Repagent is a storage driver that acts like a filter driver.
+ It can be regarded as a 'plugin' that is activated when the management system enables replication.
Main feature of Repagent
- Repagent does the following:
- * Report volumes - report a list of all volumes in a VM to the Rephub.
- * Report writes to a volume - send all writes made to a protected volume to the Rephub.
- The reporting of an IO is asyncronuous - i.e. the IO is not delayed by the Repagent to get any acknowledgement from the Rephub.
- It is only copied to the Rephub.
- * Read a protected volume - allows the Rephub to read a protected volume, to enable the protected hub to syncronize the content of a protected volume.
+ Repagent has the following main features:
+ * Report volumes - report a list of all volumes in a VM to the Rephub.
+ * Mirror writes - Report writes to a volume - send all writes made to a protected volume to the Rephub.
+ The reporting of an IO is asynchronous - i.e. the IO is not delayed by the Repagent to get any acknowledgement from the Rephub.
+ It is only copied to the Rephub.
+ * Remote IO - Read/write a volume - allows the Rephub to read a protected volume, to enable the protected hub to synchronize
+ the content of a protected volume.
+ Also used to read/write to a recovery volume - the replica of a protected volume.
Description of the Repagent module
Build and run options
- New configure option: --enable-replication
- New command line option:
- -repagent [hub IP/name]
- Enable replication support for disks
- hub is the ip or name of the machine running the replication hub.
+ New configure option: --enable-repagent
+ New command line option:
+ -repagent [hub IP/name]
+ Enable replication support for disks
+ hub is the IP or name of the machine running the replication hub.
Module APIs
- The Repagent module interfaces two main components:
- 1. The Rephub - An external API based on socket messages
- 2. The generic block layer- block.c
-
- Rephub message API
- The external replication API is a message based API.
- We won't go into the structure of the messages here - just the sematics.
-
- Messages list
- (The updated list and comments are in Rephub_cmds.h)
-
- Messages from the Repagent to the Rephub:
- * Protected write
- The Repagent sends each write to a protected volume to the hub with the IO status.
- In case the status is bad the write content is not sent
- * Report VM volumes
- The agent reports all the volumes of the VM to the hub.
- * Read Volume Response
- A response to a Read Volume Request
- Sends the data read from a protected volume to the hub
- * Agent shutdown
- Notifies the hub that the agent is about to shutdown.
- This allows a graceful shutdown. Any disconnection of an agent without
- sending this command will result in a full sync of the VM volumes.
-
- Messages from the Rephub to the Repagent:
- * Start protect
- The hub instructs the agent to start protecting a volume. When a volume is protected
- all its writes are sent to to the hub.
- With this command the hub also assigns a volume ID to the given volume name.
- * Read volume request
- The hub issues a read IO to a protected volume.
- This command is used during sync - when the hub needs to read unsyncronized
- sections of a protected volume.
- This command is a request, the read data is returned by the read volume response message (see above).
- block.c API
- The API to the generic block storage layer contains 3 functionalities:
- 1. Handle writes to protected volumes
- In bdrv_co_do_writev, each write is reported to the Repagent module.
- 2. Handle each new volume that registers
- In bdrv_open - each new bottom-level block driver that registers is reported.
- 2. Read from a volume
- Repagent calls bdrv_aio_readv to handle read requests coming from the hub.
+ The Repagent module interfaces two main components:
+ 1. The Rephub - An external API based on socket messages
+ See detailed comments about each message in rephub_cmds.h
+ 2. The generic block layer- block.c
+ Repagent is a block driver. Most of the block driver functions are just a pass-through
+ to the next driver.
+ Writes are mirrored to the hub for replication.
+ The open function is used for registering each volume in Repagent.
General description of a Rephub - a replication system the repagent connects to
- This section describes in high level a sample Rephub - a replication system that uses the repagent API
- to replicate disks.
- It describes a simple Rephub that comntinuously maintains a mirror of the volumes of a VM.
-
- Say we have a VM we want to protect - call it PVM, say it has 2 volumes - V1, V2.
- Our Rephub is called SingleRephub - a Rephub protecting a single VM.
-
- Preparations
- 1. The user chooses a host to rub SingleRephub - a different host than PVM, call it Host2
- 2. The user creates two volumes on Host2 - same sizes of V1 and V2, call them V1R (V1 recovery) and V2R.
- 3. The user runs SingleRephub process on Host2, and gives V1R and V2R as command line arguments.
- From now on SingleRephub waits for the protected VM repagent to connect.
- 4. The user runs the protected VM PVM - and uses the switch -repagent <Host2 IP>.
-
- Runtime
- 1. The repagent module connects to SingleRephub on startup.
- 2. repagent reports V1 and V2 to SingleRephub.
- 3. SingleRephub starts to perform an initial synchronization of the protected volumes-
- it reads each protected volume (V1 and V2) - using read volume requests - and copies the data into the
- recovery volume V1R and V2R.
- 4. SingleRephub enters 'protection' mode - each write to the protected volume is sent by the repagent to the Rephub,
- and the Rephub performs the write on the matching recovery volume.
-
- * Note that during stage 3 writes to the protected volumes are not ignored - they're kept in a bitmap,
- and will be read again when stage 3 ends, in an interative convergin process.
-
- This flow continuously maintains an updated recovery volume.
- If the protected system is damaged, the user can create a new VM on Host2 with the replicated volumes attached to it.
- The new VM is a replica of the protected system.
+ This section describes at a high level a sample Rephub - a replication system that uses the repagent API
+ to replicate disks.
+ It describes a simple Rephub that continuously maintains a mirror of the volumes of a VM.
+
+ Say we have a VM we want to protect - call it PVM, say it has 2 volumes - V1, V2.
+ Our Rephub is called SingleRephub - a Rephub protecting a single VM.
+
+ Preparations
+ 1. The user chooses a host to run SingleRephub - a different host than PVM, call it Host2
+ 2. The user creates two volumes on Host2 - the same sizes as V1 and V2, call them V1R (V1 recovery) and V2R.
+ 3. The user runs the SingleRephub process on Host2, and gives V1R and V2R as command line arguments.
+ From now on SingleRephub waits for the protected VM repagent to connect.
+ 4. The user runs the protected VM PVM - and uses the switch -repagent <Host2 IP>.
+
+ Runtime
+ 1. The repagent module connects to SingleRephub on startup.
+ 2. repagent reports V1 and V2 to SingleRephub.
+ 3. SingleRephub starts to perform an initial synchronization of the protected volumes -
+ it reads each protected volume (V1 and V2) - using remote IO requests - and copies the data into the
+ recovery volumes V1R and V2R.
+ 4. SingleRephub enters 'protection' mode - each write to the protected volume is sent by the repagent to the Rephub,
+ and the Rephub performs the write on the matching recovery volume.
+
+ * Note that during stage 3 writes to the protected volumes are not ignored - they're kept in a bitmap,
+ and will be read again when stage 3 ends, in an iterative converging process.
+
+ This flow continuously maintains an updated recovery volume.
+ If the protected system is damaged, the user can create a new VM on Host2 with the replicated volumes attached to it.
+ The new VM is a replica of the protected system.
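
The note on stage 3 above (writes during the initial sync are kept in a
bitmap and re-read in an iterative, converging process) can be sketched
roughly as below. This is only an illustration under stated assumptions,
not code from repagent or from any Rephub: SyncState, NCHUNKS,
guest_write(), sync_pass() and sync_until_clean() are hypothetical names,
and each "chunk" of a volume is modeled as a single byte so the example
stays self-contained.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NCHUNKS 64u /* hypothetical volume size, in chunks */

/* Hypothetical hub-side sync state: one dirty bit per chunk. */
typedef struct SyncState {
    uint8_t dirty[NCHUNKS / 8]; /* 1 = chunk still needs copying */
} SyncState;

/* Initial sync starts with every chunk considered out of date. */
static void sync_mark_all_dirty(SyncState *s)
{
    memset(s->dirty, 0xFF, sizeof(s->dirty));
}

static void sync_mark_dirty(SyncState *s, unsigned chunk)
{
    assert(chunk < NCHUNKS);
    s->dirty[chunk / 8] |= (uint8_t)(1u << (chunk % 8));
}

/* Returns 1 if the chunk was dirty, and clears its bit. */
static int sync_test_and_clear(SyncState *s, unsigned chunk)
{
    uint8_t mask = (uint8_t)(1u << (chunk % 8));
    int was_dirty = (s->dirty[chunk / 8] & mask) != 0;

    s->dirty[chunk / 8] &= (uint8_t)~mask;
    return was_dirty;
}

/* A guest write that arrives while the sync is running: it is not
 * ignored, it just marks the chunk to be read again later. */
static void guest_write(SyncState *s, uint8_t *protected_vol,
                        unsigned chunk, uint8_t value)
{
    protected_vol[chunk] = value;
    sync_mark_dirty(s, chunk);
}

/* One pass: copy every dirty chunk to the recovery volume.
 * Returns the number of chunks copied in this pass. */
static unsigned sync_pass(SyncState *s, const uint8_t *protected_vol,
                          uint8_t *recovery_vol)
{
    unsigned copied = 0;

    for (unsigned c = 0; c < NCHUNKS; c++) {
        if (sync_test_and_clear(s, c)) {
            recovery_vol[c] = protected_vol[c];
            copied++;
        }
    }
    return copied;
}

/* Repeat passes until one finds nothing to copy.
 * Returns the number of passes that copied at least one chunk. */
static unsigned sync_until_clean(SyncState *s, const uint8_t *protected_vol,
                                 uint8_t *recovery_vol)
{
    unsigned passes = 0;

    while (sync_pass(s, protected_vol, recovery_vol) > 0) {
        passes++;
    }
    return passes;
}
```

In a real system the loop only converges if each pass copies faster than
the guest dirties new chunks; the sketch leaves that policy out.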
diff --git a/block/repagent/repagent.c b/block/repagent/repagent.c
index c3dd593..bdc0117 100644
--- a/block/repagent/repagent.c
+++ b/block/repagent/repagent.c
@@ -29,7 +29,7 @@ struct RepAgentState {
typedef struct RepagentReadVolIo {
QEMUIOVector qiov;
- RepCmdReadVolReq rep_cmd;
+ RepCmdRemoteIoReq rep_cmd;
uint8_t *buf;
struct timeval start_time;
} RepagentReadVolIo;
@@ -38,7 +38,7 @@ static int repagent_get_volume_by_driver(
BlockDriverState *bs);
static int repagent_get_volume_by_name(const char *name);
static void repagent_report_volumes_to_hub(void);
-static void repagent_vol_read_done(void *opaque, int ret);
+static void repagent_remote_io_done(void *opaque, int ret);
static struct timeval tsub(struct timeval t1, struct timeval t2);
RepAgentState g_rep_agent = { 0 };
@@ -242,15 +242,15 @@ static int repagent_get_volume_by_id(uint64_t vol_id)
return -1;
}
-int repaget_read_vol(RepCmdReadVolReq *pcmd, uint8_t *pdata)
+int repagent_remote_io(RepCmdRemoteIoReq *pcmd, uint8_t *pdata)
{
int index = repagent_get_volume_by_id(pcmd->volume_id);
int size_bytes = pcmd->size_sectors * 512;
if (index < 0) {
printf("Vol read - Could not find vol id %llx\n",
(unsigned long long int) pcmd->volume_id);
- RepCmdReadVolRes *p_res_cmd = (RepCmdReadVolRes *) repcmd_new(
- REPHUB_CMD_READ_VOL_RES, 0, NULL);
+ RepCmdRemoteIoRes *p_res_cmd = (RepCmdRemoteIoRes *) repcmd_new(
+ REPHUB_CMD_REMOTE_IO_RES, 0, NULL);
p_res_cmd->req_id = pcmd->req_id;
p_res_cmd->volume_id = pcmd->volume_id;
p_res_cmd->io_status = -1;
@@ -264,58 +264,67 @@ int repaget_read_vol(RepCmdReadVolReq *pcmd, uint8_t *pdata)
(unsigned long long int) pcmd->offset_sectors,
pcmd->size_sectors);
{
- RepagentReadVolIo *read_xact = calloc(1, sizeof(RepagentReadVolIo));
+ RepagentReadVolIo *io_xaction = calloc(1, sizeof(RepagentReadVolIo));
/* BlockDriverAIOCB *acb; */
- ZERO_MEM_OBJ(read_xact);
+ ZERO_MEM_OBJ(io_xaction);
- qemu_iovec_init(&read_xact->qiov, 1);
+ qemu_iovec_init(&io_xaction->qiov, 1);
/*read_xact->buf =
qemu_blockalign(g_rep_agent.volumes[index]->driver_ptr,
size_bytes); */
- read_xact->buf = (uint8_t *) g_malloc(size_bytes);
- read_xact->rep_cmd = *pcmd;
- qemu_iovec_add(&read_xact->qiov, read_xact->buf, size_bytes);
+ io_xaction->buf = (uint8_t *) g_malloc(size_bytes);
+ io_xaction->rep_cmd = *pcmd;
+ qemu_iovec_add(&io_xaction->qiov, io_xaction->buf, size_bytes);
- gettimeofday(&read_xact->start_time, NULL);
+ gettimeofday(&io_xaction->start_time, NULL);
/* orim TODO - use the returned acb to cancel the request on
shutdown */
- /*acb = */bdrv_aio_readv(g_rep_agent.volumes[index]->driver_ptr,
- read_xact->rep_cmd.offset_sectors, &read_xact->qiov,
- read_xact->rep_cmd.size_sectors, repagent_vol_read_done,
- read_xact);
+ /*acb = */
+ if (pcmd->is_read) {
+ bdrv_aio_readv(g_rep_agent.volumes[index]->driver_ptr,
+ io_xaction->rep_cmd.offset_sectors, &io_xaction->qiov,
+ io_xaction->rep_cmd.size_sectors, repagent_remote_io_done,
+ io_xaction);
+ } else {
+ bdrv_aio_writev(g_rep_agent.volumes[index]->driver_ptr,
+ io_xaction->rep_cmd.offset_sectors, &io_xaction->qiov,
+ io_xaction->rep_cmd.size_sectors, repagent_remote_io_done,
+ io_xaction);
+ }
}
return TRUE;
}
-static void repagent_vol_read_done(void *opaque, int ret)
+static void repagent_remote_io_done(void *opaque, int ret)
{
struct timeval t2;
- RepagentReadVolIo *read_xact = (RepagentReadVolIo *) opaque;
+ RepagentReadVolIo *io_xaction = (RepagentReadVolIo *) opaque;
uint8_t *pdata = NULL;
- RepCmdReadVolRes *pcmd = (RepCmdReadVolRes *) repcmd_new(
- REPHUB_CMD_READ_VOL_RES, read_xact->rep_cmd.size_sectors * 512,
+ RepCmdRemoteIoRes *pcmd = (RepCmdRemoteIoRes *) repcmd_new(
+ REPHUB_CMD_REMOTE_IO_RES, io_xaction->rep_cmd.size_sectors * 512,
&pdata);
- pcmd->req_id = read_xact->rep_cmd.req_id;
- pcmd->volume_id = read_xact->rep_cmd.volume_id;
+ pcmd->req_id = io_xaction->rep_cmd.req_id;
+ pcmd->volume_id = io_xaction->rep_cmd.volume_id;
pcmd->io_status = -1;
- printf("Protected vol read - volId %llu, offset %llu, size %u\n",
- (unsigned long long int) read_xact->rep_cmd.volume_id,
- (unsigned long long int) read_xact->rep_cmd.offset_sectors,
- read_xact->rep_cmd.size_sectors);
+ printf("Remote IO request - volId %llu, offset %llu, size %u, is_read %u\n",
+ (unsigned long long int) io_xaction->rep_cmd.volume_id,
+ (unsigned long long int) io_xaction->rep_cmd.offset_sectors,
+ io_xaction->rep_cmd.size_sectors,
+ io_xaction->rep_cmd.is_read);
gettimeofday(&t2, NULL);
if (ret >= 0) {
/* Read response - send the data to the hub */
- t2 = tsub(t2, read_xact->start_time);
- printf("Read prot vol done. Took %u seconds, %u us.",
+ t2 = tsub(t2, io_xaction->start_time);
+ printf("Remote IO done. Took %u seconds, %u us.",
(uint32_t) t2.tv_sec, (uint32_t) t2.tv_usec);
pcmd->io_status = 0; /* Success */
/* orim todo optimize - don't copy, use the qiov buffer */
- qemu_iovec_to_buffer(&read_xact->qiov, pdata);
+ qemu_iovec_to_buffer(&io_xaction->qiov, pdata);
} else {
printf("readv failed: %s\n", strerror(-ret));
}
@@ -323,9 +332,9 @@ static void repagent_vol_read_done(void *opaque, int ret)
repagent_client_send((RepCmd *) pcmd);
/*qemu_vfree(read_xact->buf); */
- g_free(read_xact->buf);
+ g_free(io_xaction->buf);
- g_free(read_xact);
+ g_free(io_xaction);
}
static struct timeval tsub(struct timeval t1, struct timeval t2)
diff --git a/block/repagent/repagent.h b/block/repagent/repagent.h
index 310db0f..0f69820 100644
--- a/block/repagent/repagent.h
+++ b/block/repagent/repagent.h
@@ -30,7 +30,7 @@
typedef struct RepAgentState RepAgentState;
typedef struct RepCmdStartProtect RepCmdStartProtect;
typedef struct RepCmdDataStartProtect RepCmdDataStartProtect;
-struct RepCmdReadVolReq;
+struct RepCmdRemoteIoReq;
/* orim temporary */
extern int use_repagent;
@@ -45,7 +45,7 @@ void repagent_deregister_drive(const char *drive_path,
BlockDriverState *driver_ptr);
int repaget_start_protect(RepCmdStartProtect *pcmd,
RepCmdDataStartProtect *pcmd_data);
-int repaget_read_vol(struct RepCmdReadVolReq *pcmd, uint8_t *pdata);
+int repagent_remote_io(struct RepCmdRemoteIoReq *pcmd, uint8_t *pdata);
void repagent_client_connected(void);
diff --git a/block/repagent/repagent_client.c b/block/repagent/repagent_client.c
index 9ed8485..ee4aeb7 100644
--- a/block/repagent/repagent_client.c
+++ b/block/repagent/repagent_client.c
@@ -125,8 +125,8 @@ void repagent_process_cmd(RepCmd *pcmd, uint8_t *pdata, void *clientPtr)
(RepCmdDataStartProtect *) pdata);
}
break;
- case REPHUB_CMD_READ_VOL_REQ: {
- is_free_data = repaget_read_vol((RepCmdReadVolReq *) pcmd, pdata);
+ case REPHUB_CMD_REMOTE_IO_REQ: {
+ is_free_data = repagent_remote_io((RepCmdRemoteIoReq *) pcmd, pdata);
}
break;
default:
diff --git a/block/repagent/repcmd.h b/block/repagent/repcmd.h
index 8c6cf1b..0a7f297 100644
--- a/block/repagent/repcmd.h
+++ b/block/repagent/repcmd.h
@@ -36,8 +36,8 @@ enum RepCmds {
REPHUB_CMD_PROTECTED_WRITE = 2,
REPHUB_CMD_REPORT_VM_VOLUMES = 3,
REPHUB_CMD_START_PROTECT = 4,
- REPHUB_CMD_READ_VOL_REQ = 5,
- REPHUB_CMD_READ_VOL_RES = 6,
+ REPHUB_CMD_REMOTE_IO_REQ = 5,
+ REPHUB_CMD_REMOTE_IO_RES = 6,
REPHUB_CMD_AGENT_SHUTDOWN = 7,
};
diff --git a/block/repagent/rephub_cmds.h b/block/repagent/rephub_cmds.h
index d1bad06..cb737e6 100644
--- a/block/repagent/rephub_cmds.h
+++ b/block/repagent/rephub_cmds.h
@@ -97,40 +97,43 @@ typedef struct RepCmdDataStartProtect {
/*********************************************************
- * RepCmd Read Volume Request
+ * RepCmd Remote IO Request
*
- * REPHUB_CMD_READ_VOL_REQ
+ * REPHUB_CMD_REMOTE_IO_REQ
* Direction: hub->agent
*
- * The hub issues a read IO to a protected volume.
- * This command is used during sync - when the hub needs
- * to read unsyncronized sections of a protected volume.
+ * The hub issues an IO to a volume.
+ * This command is used for:
+ * - Reading protected volume during sync
+ * - Read/write to a recovery volume during
+ * protect and failover or failover test
* This command is a request, the read data is returned
- * by the response command REPHUB_CMD_READ_VOL_RES
+ * by the response command REPHUB_CMD_REMOTE_IO_RES
*********************************************************/
-typedef struct RepCmdReadVolReq {
+typedef struct RepCmdRemoteIoReq {
RepCmdHdr hdr;
int req_id;
int size_sectors;
uint64_t volume_id;
uint64_t offset_sectors;
-} RepCmdReadVolReq;
+ int is_read;
+} RepCmdRemoteIoReq;
/*********************************************************
* RepCmd Read Volume Response
*
- * REPHUB_CMD_READ_VOL_RES
+ * REPHUB_CMD_REMOTE_IO_RES
* Direction: agent->hub
*
- * A response to REPHUB_CMD_READ_VOL_REQ.
+ * A response to REPHUB_CMD_REMOTE_IO_REQ.
* Sends the data read from a protected volume
*********************************************************/
-typedef struct RepCmdReadVolRes {
+typedef struct RepCmdRemoteIoRes {
RepCmdHdr hdr;
int req_id;
int io_status;
uint64_t volume_id;
-} RepCmdReadVolRes;
+} RepCmdRemoteIoRes;
/*********************************************************
* RepCmd Agent shutdown
--
1.7.6.5
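
In repagent_remote_io() above, size_bytes is computed as an int product of
size_sectors and 512, and in the write branch the hunk as posted does not
show the hub's payload (pdata) being copied into the freshly allocated
buffer before bdrv_aio_writev() is called. A minimal sketch of how the
size check and buffer preparation might look is given below. It is an
assumption-laden illustration, not the patch's code: RemoteIoReqSketch
only mirrors the RepCmdRemoteIoReq fields used here (the RepCmdHdr member
is left out), remote_io_size_bytes() and remote_io_prepare_buf() are
hypothetical helpers, and plain malloc() stands in for g_malloc() so the
example builds on its own.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SECTOR_SIZE 512u /* the patch hard-codes 512 */

/* Hypothetical stand-in for RepCmdRemoteIoReq, without the hdr member. */
typedef struct RemoteIoReqSketch {
    int req_id;
    int size_sectors;
    uint64_t volume_id;
    uint64_t offset_sectors;
    int is_read;
} RemoteIoReqSketch;

/* Compute the request size in bytes using 64-bit arithmetic.
 * Returns 0 and stores the size in *out, or -1 if the sector count is
 * not positive or the byte count does not fit in size_t. */
static int remote_io_size_bytes(const RemoteIoReqSketch *req, size_t *out)
{
    uint64_t bytes;

    if (req->size_sectors <= 0) {
        return -1;
    }
    bytes = (uint64_t)req->size_sectors * SECTOR_SIZE;
    if (bytes > SIZE_MAX) {
        return -1;
    }
    *out = (size_t)bytes;
    return 0;
}

/* Allocate the I/O buffer for a request. For a write, the buffer is
 * filled from the payload that arrived with the command; for a read it
 * is zeroed and would be filled by the block layer. Returns NULL on a
 * bad size, a short payload, or allocation failure. The caller frees
 * the buffer with free(). */
static uint8_t *remote_io_prepare_buf(const RemoteIoReqSketch *req,
                                      const uint8_t *payload,
                                      size_t payload_len)
{
    size_t n;
    uint8_t *buf;

    if (remote_io_size_bytes(req, &n) != 0) {
        return NULL;
    }
    if (!req->is_read && (payload == NULL || payload_len < n)) {
        return NULL;
    }
    buf = malloc(n);
    if (buf == NULL) {
        return NULL;
    }
    if (req->is_read) {
        memset(buf, 0, n);
    } else {
        memcpy(buf, payload, n);
    }
    return buf;
}
```

Whether the payload copy belongs in repagent_remote_io() or elsewhere in
the client code is a design decision for the patch author; the sketch
only shows the check and the copy in one place.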