* master - activation: Remove empty DM device when table fails to load.
@ 2014-05-28 15:17 Jonathan Brassow
2014-05-28 15:45 ` Zdenek Kabelac
0 siblings, 1 reply; 2+ messages in thread
From: Jonathan Brassow @ 2014-05-28 15:17 UTC (permalink / raw)
To: lvm-devel
Gitweb: http://git.fedorahosted.org/git/?p=lvm2.git;a=commitdiff;h=442820aae3648e1846417d1248fa36030eba4bd8
Commit: 442820aae3648e1846417d1248fa36030eba4bd8
Parent: 8212dac849e5a820c579c06a1b2805e7934a8d8b
Author: Jonathan Brassow <jbrassow@redhat.com>
AuthorDate: Wed May 28 10:17:15 2014 -0500
Committer: Jonathan Brassow <jbrassow@redhat.com>
CommitterDate: Wed May 28 10:17:15 2014 -0500
activation: Remove empty DM device when table fails to load.
As part of better error handling, remove DM devices that have been
sucessfully created but failed to load a table. This can happen
when pvmove'ing in a cluster and the cluster mirror daemon is not
running on a remote node - the mapping table failing to load as a
result. In this case, any revert would work on other nodes running
cmirrord because the DM devices on those nodes did succeed in loading.
However, because no table was able to load on the non-cmirrord nodes,
there is no table present that points to what needs to be reverted.
This causes the empty DM device to remain on the system without being
present in any LVM representation.
This patch should only be considered a partial fix to the overall
problem. This is because only the device which failed to load a
table is removed. Any LVs that may have been loaded as requirements
to the DM device that failed to load may be left in place. Complete
clean-up will require tracking those devices which have been created
as dependencies and removing them along with the device that failed
to load a table.
---
libdm/libdm-deptree.c | 59 ++++++++++++++++++++++++++++++++++++++++++++-----
1 files changed, 53 insertions(+), 6 deletions(-)
diff --git a/libdm/libdm-deptree.c b/libdm/libdm-deptree.c
index 0a53755..f34c40f 100644
--- a/libdm/libdm-deptree.c
+++ b/libdm/libdm-deptree.c
@@ -1920,15 +1920,50 @@ static int _create_node(struct dm_tree_node *dnode)
if (!dm_task_no_open_count(dmt))
log_error("Failed to disable open_count");
- if ((r = dm_task_run(dmt)))
- r = dm_task_get_info(dmt, &dnode->info);
-
+ if ((r = dm_task_run(dmt))) {
+ if (!(r = dm_task_get_info(dmt, &dnode->info)))
+ /*
+ * This should not be possible to occur. However,
+ * we print an error message anyway for the more
+ * absurd cases (e.g. memory corruption) so there
+ * is never any question as to which one failed.
+ */
+ log_error(INTERNAL_ERROR
+ "Unable to get DM task info for %s.",
+ dnode->name);
+ }
out:
dm_task_destroy(dmt);
return r;
}
+/*
+ * _remove_node
+ *
+ * This function is only used to remove a DM device that has failed
+ * to load any table.
+ */
+static int _remove_node(struct dm_tree_node *dnode)
+{
+ if (!dnode->info.exists)
+ return 1;
+
+ if (dnode->info.live_table || dnode->info.inactive_table) {
+ log_error(INTERNAL_ERROR
+ "_remove_node called on device with loaded table(s).");
+ return 0;
+ }
+
+ if (!_deactivate_node(dnode->name, dnode->info.major, dnode->info.minor,
+ &dnode->dtree->cookie, dnode->udev_flags, 0)) {
+ log_error("Failed to clean-up device with no table: %s %u:%u",
+ dnode->name ? dnode->name : "",
+ dnode->info.major, dnode->info.minor);
+ return 0;
+ }
+ return 1;
+}
static int _build_dev_string(char *devbuf, size_t bufsize, struct dm_tree_node *node)
{
@@ -2662,7 +2697,7 @@ int dm_tree_preload_children(struct dm_tree_node *dnode,
const char *uuid_prefix,
size_t uuid_prefix_len)
{
- int r = 1;
+ int r = 1, node_created = 0;
void *handle = NULL;
struct dm_tree_node *child;
struct dm_info newinfo;
@@ -2684,13 +2719,25 @@ int dm_tree_preload_children(struct dm_tree_node *dnode,
return_0;
/* FIXME Cope if name exists with no uuid? */
- if (!child->info.exists && !_create_node(child))
+ if (!child->info.exists && !(node_created = _create_node(child)))
return_0;
if (!child->info.inactive_table &&
child->props.segment_count &&
- !_load_node(child))
+ !_load_node(child)) {
+ /*
+ * If the table load does not succeed, we remove the
+ * device in the kernel that would otherwise have an
+ * empty table. This makes the create + load of the
+ * device atomic. However, if other dependencies have
+ * already been created and loaded; this code is
+ * insufficient to remove those - only the node
+ * encountering the table load failure is removed.
+ */
+ if (node_created && !_remove_node(child))
+ return_0;
return_0;
+ }
/* Propagate device size change change */
if (child->props.size_changed)
^ permalink raw reply related [flat|nested] 2+ messages in thread
* master - activation: Remove empty DM device when table fails to load.
2014-05-28 15:17 master - activation: Remove empty DM device when table fails to load Jonathan Brassow
@ 2014-05-28 15:45 ` Zdenek Kabelac
0 siblings, 0 replies; 2+ messages in thread
From: Zdenek Kabelac @ 2014-05-28 15:45 UTC (permalink / raw)
To: lvm-devel
Dne 28.5.2014 17:17, Jonathan Brassow napsal(a):
> Gitweb: http://git.fedorahosted.org/git/?p=lvm2.git;a=commitdiff;h=442820aae3648e1846417d1248fa36030eba4bd8
> Commit: 442820aae3648e1846417d1248fa36030eba4bd8
> Parent: 8212dac849e5a820c579c06a1b2805e7934a8d8b
> Author: Jonathan Brassow <jbrassow@redhat.com>
> AuthorDate: Wed May 28 10:17:15 2014 -0500
> Committer: Jonathan Brassow <jbrassow@redhat.com>
> CommitterDate: Wed May 28 10:17:15 2014 -0500
>
> activation: Remove empty DM device when table fails to load.
>
> As part of better error handling, remove DM devices that have been
> sucessfully created but failed to load a table. This can happen
> when pvmove'ing in a cluster and the cluster mirror daemon is not
> running on a remote node - the mapping table failing to load as a
> result. In this case, any revert would work on other nodes running
> cmirrord because the DM devices on those nodes did succeed in loading.
> However, because no table was able to load on the non-cmirrord nodes,
> there is no table present that points to what needs to be reverted.
> This causes the empty DM device to remain on the system without being
> present in any LVM representation.
>
> This patch should only be considered a partial fix to the overall
> problem. This is because only the device which failed to load a
> table is removed. Any LVs that may have been loaded as requirements
> to the DM device that failed to load may be left in place. Complete
> clean-up will require tracking those devices which have been created
> as dependencies and removing them along with the device that failed
> to load a table.
> ---
> libdm/libdm-deptree.c | 59 ++++++++++++++++++++++++++++++++++++++++++++-----
> 1 files changed, 53 insertions(+), 6 deletions(-)
>
> diff --git a/libdm/libdm-deptree.c b/libdm/libdm-deptree.c
> index 0a53755..f34c40f 100644
> --- a/libdm/libdm-deptree.c
> +++ b/libdm/libdm-deptree.c
> @@ -1920,15 +1920,50 @@ static int _create_node(struct dm_tree_node *dnode)
> if (!dm_task_no_open_count(dmt))
> log_error("Failed to disable open_count");
>
> - if ((r = dm_task_run(dmt)))
> - r = dm_task_get_info(dmt, &dnode->info);
> -
> + if ((r = dm_task_run(dmt))) {
> + if (!(r = dm_task_get_info(dmt, &dnode->info)))
> + /*
This is adding just another 'ioctl' on the path - IMHO this should be
removed.
Zdenek
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2014-05-28 15:45 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-05-28 15:17 master - activation: Remove empty DM device when table fails to load Jonathan Brassow
2014-05-28 15:45 ` Zdenek Kabelac
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.