From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:58391) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WNU3R-0001BX-Gt for qemu-devel@nongnu.org; Tue, 11 Mar 2014 17:17:46 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1WNU3M-0006UO-Ce for qemu-devel@nongnu.org; Tue, 11 Mar 2014 17:17:41 -0400 Received: from lnantes-156-75-100-125.w80-12.abo.wanadoo.fr ([80.12.84.125]:56046 helo=paradis.irqsave.net) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WNU3M-0006TR-0G for qemu-devel@nongnu.org; Tue, 11 Mar 2014 17:17:36 -0400 Date: Tue, 11 Mar 2014 22:17:34 +0100 From: =?iso-8859-1?Q?Beno=EEt?= Canet Message-ID: <20140311211733.GC3257@irqsave.net> References: <1394555780-27665-1-git-send-email-benoit.canet@irqsave.net> <1394555780-27665-3-git-send-email-benoit.canet@irqsave.net> <531F796D.3050404@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <531F796D.3050404@redhat.com> Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [PATCH 2/3] block: Add drive-mirror-replace command to repair quorum files. List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Max Reitz Cc: =?iso-8859-1?Q?Beno=EEt?= Canet , kwolf@redhat.com, qemu-devel@nongnu.org, stefanha@redhat.com The Tuesday 11 Mar 2014 =E0 22:00:29 (+0100), Max Reitz wrote : > On 11.03.2014 17:36, Beno=EEt Canet wrote: > >When a quorum file is totally destroyed (broken filer) the user can st= art a >=20 > *file I really meant filer as in file server: I will write NAS instead. >=20 > >drive-mirror job on the quorum block backend and then replace the brok= en > >quorum file with drive-mirror-replace given it has a node-name. > > > >Signed-off-by: Benoit Canet > >--- > > block.c | 6 +-- > > block/mirror.c | 115 ++++++++++++++++++++++++++++++++++++= ++++++++-- > > blockdev.c | 27 +++++++++++ > > include/block/block.h | 3 ++ > > include/block/block_int.h | 15 ++++++ > > qapi-schema.json | 33 +++++++++++++ > > qmp-commands.hx | 5 ++ > > trace-events | 1 + > > 8 files changed, 199 insertions(+), 6 deletions(-) > > > >diff --git a/block.c b/block.c > >index 9f2fe85..2354e5b 100644 > >--- a/block.c > >+++ b/block.c > >@@ -787,9 +787,9 @@ static int bdrv_open_flags(BlockDriverState *bs, i= nt flags) > > return open_flags; > > } > >-static int bdrv_assign_node_name(BlockDriverState *bs, > >- const char *node_name, > >- Error **errp) > >+int bdrv_assign_node_name(BlockDriverState *bs, > >+ const char *node_name, > >+ Error **errp) > > { >=20 > Before this patch, there was only a single caller to this function, > and it (obviously, due to the "static") resided in block.c as well > (it was part of bdrv_open_common(), to be precise). This caller used > this function to assign a node name to a previously unnamed BDS. > With this patch, there is a new caller > (mirror_activate_replace_target()) and apparently, the BDS it uses > this function on is unnamed as well (the call to > bdrv_open_backing_file() in mirror_complete() could have named it, > but there are not options given, thus, it will remain unnamed). >=20 > The point why I'm bringing this up is that bdrv_assign_node_name() > always adds the BDS to the graph_bdrv_states list if a new name is > given. Therefore, if the BDS was already named, it will be twice in > that list. This is now not an issue, as the BDS will have been > unnamed in any case before bdrv_assign_node_name() was called. I'd > however prefer some assertion in that latter function that the BDS > will not be added twice to the list; I guess a simple assertion that > it is unnamed (bs->node_name[0] =3D=3D '\0') will suffice. >=20 > > if (!node_name) { > > return 0; > >diff --git a/block/mirror.c b/block/mirror.c > >index 6dc84e8..8133ca5 100644 > >--- a/block/mirror.c > >+++ b/block/mirror.c > >@@ -49,6 +49,15 @@ typedef struct MirrorBlockJob { > > unsigned long *in_flight_bitmap; > > int in_flight; > > int ret; > >+ /* these four fields are used by drive-mirror-replace */ > >+ /* Must the code replace a target with the new mirror */ >=20 > Hm, probably better "The code must/has to/should replace a target > with the new mirror". >=20 > >+ bool must_replace; > >+ /* The block BDS to replace */ > >+ BlockDriverState *to_replace; > >+ /* the node-name of the new mirror BDS */ > >+ char *new_node_name; > >+ /* Used to block operations on the drive-mirror-replace target. *= / > >+ Error *replace_blocker; > > } MirrorBlockJob; > > typedef struct MirrorOp { > >@@ -299,6 +308,7 @@ static void coroutine_fn mirror_run(void *opaque) > > MirrorBlockJob *s =3D opaque; > > BlockDriverState *bs =3D s->common.bs; > > int64_t sector_num, end, sectors_per_chunk, length; > >+ BlockDriverState *to_replace; > > uint64_t last_pause_ns; > > BlockDriverInfo bdi; > > char backing_filename[1024]; > >@@ -477,11 +487,17 @@ immediate_exit: > > g_free(s->in_flight_bitmap); > > bdrv_release_dirty_bitmap(bs, s->dirty_bitmap); > > bdrv_iostatus_disable(s->target); > >+ /* Here we handle the drive-mirror-replace QMP command */ > >+ if (s->must_replace) { > >+ to_replace =3D s->to_replace; > >+ } else { > >+ to_replace =3D s->common.bs; > >+ } > > if (s->should_complete && ret =3D=3D 0) { > >- if (bdrv_get_flags(s->target) !=3D bdrv_get_flags(s->common.b= s)) { > >- bdrv_reopen(s->target, bdrv_get_flags(s->common.bs), NULL= ); > >+ if (bdrv_get_flags(s->target) !=3D bdrv_get_flags(to_replace)= ) { > >+ bdrv_reopen(s->target, bdrv_get_flags(to_replace), NULL); > > } > >- bdrv_swap(s->target, s->common.bs); > >+ bdrv_swap(s->target, to_replace); > > if (s->common.driver->job_type =3D=3D BLOCK_JOB_TYPE_COMMIT)= { > > /* drop the bs loop chain formed by the swap: break the = loop then > > * trigger the unref from the top one */ > >@@ -491,6 +507,11 @@ immediate_exit: > > bdrv_unref(p); > > } > > } > >+ if (s->must_replace) { > >+ bdrv_op_unblock_all(s->to_replace, s->replace_blocker); > >+ error_free(s->replace_blocker); > >+ bdrv_unref(s->to_replace); > >+ } > > bdrv_unref(s->target); > > block_job_completed(&s->common, ret); > > } > >@@ -513,6 +534,87 @@ static void mirror_iostatus_reset(BlockJob *job) > > bdrv_iostatus_reset(s->target); > > } > >+bool mirror_set_replace_target(BlockJob *job, const char *reference, > >+ bool has_new_node_name, > >+ const char *new_node_name, Error **err= p) > >+{ > >+ MirrorBlockJob *s =3D container_of(job, MirrorBlockJob, common); > >+ BlockDriverState *to_replace; > >+ > >+ /* We don't want to give too much power to the user as could resu= lt on >=20 > *as this could result in >=20 > >+ * BlockDriverState loops if used with something other than sync=3D= full. > >+ */ > >+ if (s->is_none_mode || s->base) { > >+ error_setg(errp, "Can only be used on a mirror done with sync= =3Dfull"); > >+ return false; > >+ } > >+ > >+ /* check target reference is not an empty string */ >=20 > *check that the target reference >=20 > >+ if (!reference[0]) { > >+ error_setg(errp, "target-reference is an empty string"); > >+ return false; > >+ } > >+ > >+ /* Get the to replace block driver state */ >=20 > It's correct, but I find it hard to understand at the first read. I > think "Get the block driver state to be replaced" would be easier. >=20 > >+ to_replace =3D bdrv_lookup_bs(reference, reference, errp); > >+ if (!to_replace) { > >+ return false; > >+ } > >+ > >+ /* If the BDS to replace is a regular node we need a new node nam= e */ >=20 > Likewise, s/to replace/to be replaced/? >=20 > >+ if (to_replace->node_name[0] && !has_new_node_name) { > >+ error_setg(errp, "A new-node-name must be provided"); > >+ return false; > >+ } > >+ > >+ /* Can only replace something else than the source of the mirror = */ > >+ if (to_replace =3D=3D job->bs) { > >+ error_setg(errp, "Cannot replace the mirror source"); > >+ return false; > >+ } > >+ > >+ /* are this bs replace operation blocked */ >=20 > s/are/is/ >=20 > >+ if (bdrv_op_is_blocked(to_replace, BLOCK_OP_TYPE_REPLACE, errp)) = { > >+ return false; > >+ } > >+ > >+ /* save usefull infos for later */ >=20 > *useful >=20 > >+ s->to_replace =3D to_replace; > >+ assert(has_new_node_name); > >+ s->new_node_name =3D g_strdup(new_node_name); > >+ > >+ return true; > >+} > >+ > >+static void mirror_activate_replace_target(BlockJob *job, Error **err= p) > >+{ > >+ MirrorBlockJob *s =3D container_of(job, MirrorBlockJob, common); > >+ > >+ /* Set the new node name if the BDS to replace is a regular node > >+ * of the graph. > >+ */ > >+ if (s->to_replace->node_name[0]) { > >+ assert(s->new_node_name); > >+ bdrv_assign_node_name(s->target, s->new_node_name, errp); > >+ g_free(s->new_node_name); > >+ } >=20 > Is s->new_node_name leaked if !s->to_replace->node_name[0]? >=20 > >+ > >+ if (error_is_set(errp)) { > >+ s->to_replace =3D NULL; > >+ return; > >+ } > >+ > >+ /* block all operations on the target bs */ > >+ error_setg(&s->replace_blocker, > >+ "block device is in use by drive-mirror-replace"); > >+ bdrv_op_block_all(s->to_replace, s->replace_blocker); > >+ > >+ bdrv_ref(s->to_replace); > >+ > >+ /* activate the replacement operation */ > >+ s->must_replace =3D true; > >+} > >+ > > static void mirror_complete(BlockJob *job, Error **errp) > > { > > MirrorBlockJob *s =3D container_of(job, MirrorBlockJob, common); > >@@ -529,6 +631,13 @@ static void mirror_complete(BlockJob *job, Error = **errp) > > return; > > } > >+ /* drive-mirror-replace is being called on this job so activate t= he > >+ * replacement target > >+ */ > >+ if (s->to_replace) { > >+ mirror_activate_replace_target(job, errp); > >+ } > >+ > > s->should_complete =3D true; > > block_job_resume(job); > > } > >diff --git a/blockdev.c b/blockdev.c > >index 8a6ae0a..901a05d 100644 > >--- a/blockdev.c > >+++ b/blockdev.c > >@@ -2365,6 +2365,33 @@ void qmp_block_job_complete(const char *device,= Error **errp) > > block_job_complete(job, errp); > > } > >+void qmp_drive_mirror_replace(const char *device, const char *target_= reference, > >+ bool has_new_node_name, > >+ const char *new_node_name, > >+ Error **errp) > >+{ > >+ BlockJob *job =3D find_block_job(device); > >+ > >+ if (!job) { > >+ error_set(errp, QERR_BLOCK_JOB_NOT_ACTIVE, device); > >+ return; > >+ } > >+ > >+ if (!job->driver || job->driver->job_type !=3D BLOCK_JOB_TYPE_MIR= ROR) { > >+ error_setg(errp, "Can only be used on a drive-mirror block jo= b"); > >+ return; > >+ } > >+ > >+ if (!mirror_set_replace_target(job, target_reference, has_new_nod= e_name, > >+ new_node_name, errp)) { > >+ return; > >+ } > >+ > >+ trace_qmp_drive_mirror_replace(job, target_reference, > >+ has_new_node_name ? new_node_name = : ""); > >+ block_job_complete(job, errp); > >+} > >+ > > void qmp_blockdev_add(BlockdevOptions *options, Error **errp) > > { > > QmpOutputVisitor *ov =3D qmp_output_visitor_new(); > >diff --git a/include/block/block.h b/include/block/block.h > >index ee1582d..a9d6ead 100644 > >--- a/include/block/block.h > >+++ b/include/block/block.h > >@@ -171,6 +171,7 @@ typedef enum BlockOpType { > > BLOCK_OP_TYPE_MIRROR, > > BLOCK_OP_TYPE_RESIZE, > > BLOCK_OP_TYPE_STREAM, > >+ BLOCK_OP_TYPE_REPLACE, > > BLOCK_OP_TYPE_MAX, > > } BlockOpType; > >@@ -209,6 +210,8 @@ int bdrv_open_image(BlockDriverState **pbs, const = char *filename, > > QDict *options, const char *bdref_key, int flags= , > > bool allow_none, Error **errp); > > void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *bac= king_hd); > >+int bdrv_assign_node_name(BlockDriverState *bs, const char *node_name= , > >+ Error **errp); > > int bdrv_open_backing_file(BlockDriverState *bs, QDict *options, Err= or **errp); > > int bdrv_open(BlockDriverState **pbs, const char *filename, > > const char *reference, QDict *options, int flags, > >diff --git a/include/block/block_int.h b/include/block/block_int.h > >index 1f4f78b..ea281c8 100644 > >--- a/include/block/block_int.h > >+++ b/include/block/block_int.h > >@@ -486,6 +486,21 @@ void mirror_start(BlockDriverState *bs, BlockDriv= erState *target, > > void *opaque, Error **errp); > > /* > >+ * mirror_set_replace_target: > >+ * @job: An active mirroring block job started with sync=3Dfull. > >+ * @reference: id or node-name of the BDS to replace when the mirror = is done. > >+ * @has_new_node_name: Set to true if new_node_name if provided > >+ * @new_node_name: The optional new node name of the new mirror. > >+ * @errp: Error object. > >+ * > >+ * Prepare a mirroring operation to replace a BDS pointed to by refer= ence with > >+ * the new mirror. > >+ */ > >+bool mirror_set_replace_target(BlockJob *job, const char *reference, > >+ bool has_new_node_name, > >+ const char *new_node_name, Error **err= p); > >+ > >+/* > > * backup_start: > > * @bs: Block device to operate on. > > * @target: Block device to write to. > >diff --git a/qapi-schema.json b/qapi-schema.json > >index f5a8767..33c5ab1 100644 > >--- a/qapi-schema.json > >+++ b/qapi-schema.json > >@@ -2716,6 +2716,39 @@ > > { 'command': 'block-job-complete', 'data': { 'device': 'str' } } > > ## > >+# @drive-mirror-replace: > >+# > >+# Manually trigger completion of an active background drive-mirror op= eration > >+# and replace the target reference with the new mirror. > >+# This switch the device to write to the target path only. >=20 > *switches >=20 > >+# The ability to complete is signaled with a BLOCK_JOB_READY event. > >+# > >+# This command completes an active drive-mirror background operation > >+# synchronously and replace the target reference with the mirror. >=20 > *replaces >=20 > Max >=20 > >+# The ordering of this command's return with the BLOCK_JOB_COMPLETED = event > >+# is not defined. Note that if an I/O error occurs during the proces= sing of > >+# this command: 1) the command itself will fail; 2) the error will be= processed > >+# according to the rerror/werror arguments that were specified when s= tarting > >+# the operation. > >+# > >+# A cancelled or paused drive-mirror job cannot be completed. > >+# > >+# @device: the device name > >+# @target-reference: the id or node name of the block driver state to= replace > >+# @new-node-name: #optional set the node-name of the new block dri= ver state > >+# needed the target reference point to a regular n= ode of the > >+# graph > >+# > >+# Returns: Nothing on success > >+# If no background operation is active on this device, Devic= eNotActive > >+# > >+# Since: 2.1 > >+## > >+{ 'command': 'drive-mirror-replace', > >+ 'data': { 'device': 'str', 'target-reference': 'str', > >+ '*new-node-name': 'str' } } > >+ > >+## > > # @ObjectTypeInfo: > > # > > # This structure describes a search result from @qom-list-types > >diff --git a/qmp-commands.hx b/qmp-commands.hx > >index dd336f7..3b5b382 100644 > >--- a/qmp-commands.hx > >+++ b/qmp-commands.hx > >@@ -1150,6 +1150,11 @@ EQMP > > .mhandler.cmd_new =3D qmp_marshal_input_block_job_complete, > > }, > > { > >+ .name =3D "drive-mirror-replace", > >+ .args_type =3D "device:B,target-reference:s,new-node-name:s?= ", > >+ .mhandler.cmd_new =3D qmp_marshal_input_drive_mirror_replace, > >+ }, > >+ { > > .name =3D "transaction", > > .args_type =3D "actions:q", > > .mhandler.cmd_new =3D qmp_marshal_input_transaction, > >diff --git a/trace-events b/trace-events > >index aec4202..5282904 100644 > >--- a/trace-events > >+++ b/trace-events > >@@ -103,6 +103,7 @@ qmp_block_job_cancel(void *job) "job %p" > > qmp_block_job_pause(void *job) "job %p" > > qmp_block_job_resume(void *job) "job %p" > > qmp_block_job_complete(void *job) "job %p" > >+qmp_drive_mirror_replace(void *job, const char *target_reference, con= st char *new_node_name) "job %p target_reference %s new_node_name %s" > > block_job_cb(void *bs, void *job, int ret) "bs %p job %p ret %d" > > qmp_block_stream(void *bs, void *job) "bs %p job %p" >=20