* [PATCH] Fix mirror corruption during primary device failure.
From: Jonathan Brassow @ 2008-09-04 14:40 UTC (permalink / raw)
  To: lvm-devel

 brassow

When down-converting mirrors (e.g. going from a 3-leg to a 2-leg mirror),
removable legs are pushed to the end of the array by swapping each one
with the last element.

Example:
- Mirror consists of devices A, B, C; and we wish to remove A
- A is first swapped with C, leaving C, B, A
- The leg count is reduced and A is removed, leaving C, B

The above works fine in most cases.  However, if there is a failure
of the primary device (the first device), the kernel selects the next
leg as the primary and continues.  While there is a failed device,
the kernel will only write to the primary.

Revisiting the above example:
- Mirror consists of devices A, B, C
- A fails, leaving a, B, C (lowercase marks the failed leg)
- The kernel selects B as the new primary and writes only to B.
- Performing the above conversion (swap A with C, then drop A) will
  cause a 2-way mirror to be put in place with THE WRONG PRIMARY, C.

Because C received no writes after the failure, it is stale.  Making it
the primary causes all writes performed between the time of failure and
the conversion to be lost, corrupting file systems and losing data.
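
The failure case can be modelled with a small sketch.  It assumes, as a
simplification of the description above, that the primary is always the
first non-failed leg in array order; struct leg, pick_primary() and
primary_after_swap_removal() are hypothetical names, not LVM2 or kernel
interfaces.

```c
#include <stddef.h>

/* Hypothetical model of a mirror leg: a name and a failure flag. */
struct leg {
	char name;
	int failed;
};

/* Assumption: the primary is the first non-failed leg in array order. */
static int pick_primary(const struct leg *legs, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++)
		if (!legs[i].failed)
			return (int) i;

	return -1; /* no usable leg */
}

/*
 * Remove legs[pos] the old way (swap with the last element, then drop
 * the last element) and report which leg ends up as the primary.
 *
 * Returns: the name of the new primary, or '\0' on invalid input
 */
static char primary_after_swap_removal(struct leg *legs, size_t count,
				       size_t pos)
{
	struct leg tmp;
	int primary;

	if (count < 2 || pos >= count)
		return '\0';

	tmp = legs[count - 1];
	legs[count - 1] = legs[pos];
	legs[pos] = tmp;

	/* The converted mirror consists of the first count - 1 legs */
	primary = pick_primary(legs, count - 1);

	return (primary < 0) ? '\0' : legs[primary].name;
}
```

For a, B, C with A failed, pick_primary() selects B before the
conversion, while primary_after_swap_removal() with pos 0 reports C, the
leg that missed every write made since the failure.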

This patch preserves the ordering of devices when moving 'removable
pvs' to the end of the array.  So, rather than having:
	1) A, B, C (starting mirror)
	2) C, B, A (reordering legs)
	3) C, B    (converted mirror)
we have:
	1) A, B, C (starting mirror)
	2) B, C, A (reordering legs)
	3) B, C    (converted mirror)
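
An order-preserving move can be sketched on its own as follows.  This is
a minimal illustration rather than the code in the patch: it uses
memmove() where the patch uses an explicit loop, and leg_t and
shift_leg_to_end() are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a mirror leg (LVM2 uses struct lv_segment_area). */
typedef char leg_t;

/*
 * Move legs[pos] to the end of the array while keeping the relative
 * order of all other legs.
 *
 * Returns: 0 on success, 1 if pos is out of range
 */
static int shift_leg_to_end(leg_t *legs, size_t count, size_t pos)
{
	leg_t removed;

	if (pos >= count)
		return 1;

	removed = legs[pos];

	/* Slide every later leg one slot toward the front to fill the hole */
	memmove(&legs[pos], &legs[pos + 1],
		(count - pos - 1) * sizeof(legs[0]));

	/* Put the removable leg in the last slot */
	legs[count - 1] = removed;

	return 0;
}
```

With legs A, B, C and pos 0, the array becomes B, C, A, so the leg that
took over as primary after a failure of A is still first once A is
dropped.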

Index: LVM2-rhel5/lib/metadata/mirror.c
===================================================================
--- LVM2-rhel5.orig/lib/metadata/mirror.c
+++ LVM2-rhel5/lib/metadata/mirror.c
@@ -136,6 +136,53 @@ uint32_t adjusted_mirror_region_size(uin
 }
 
 /*
+ * shift_mirror_legs
+ * @mirrored_seg
+ * @leg_pos:  The position (index) of the leg to move to the end
+ *
+ * When dealing with removal of legs, we often move a 'removable leg'
+ * to the back of the 'areas' array.  It is critically important not
+ * to simply swap it for the last area in the array.  This would have
+ * the effect of reordering the remaining legs - altering position of
+ * the primary.  So, we must shuffle all of the areas in the array
+ * to maintain their relative position before moving the 'removable
+ * leg' to the end.
+ *
+ * Short illustration of the problem:
+ *   - Mirror consists of legs A, B, C and we want to remove A
+ *   - We swap A and C and then remove A, leaving C, B
+ * This scenario is problematic in failure cases where A dies, because
+ * B becomes the primary.  If the above happens, we effectively throw
+ * away any changes made between the time of failure and the time of
+ * restructuring the mirror.
+ *
+ * So, any time we want to move areas to the end to be removed, use
+ * this function.
+ *
+ * Returns: 0 on success, 1 on failure
+ */
+static int shift_mirror_legs(struct lv_segment *mirrored_seg, int leg_pos)
+{
+	int i;
+	struct lv_segment_area area;
+
+
+	if (leg_pos >= mirrored_seg->area_count)
+		return 1; /* -EINVAL */
+
+	area = mirrored_seg->areas[leg_pos];
+
+	/* Shift everyone down to fill the hole */
+	for (i = leg_pos+1; i < mirrored_seg->area_count; i++)
+		mirrored_seg->areas[i-1] = mirrored_seg->areas[i];
+
+	/* Stick this one at the end */
+	mirrored_seg->areas[i-1] = area;
+
+	return 0;
+}
+
+/*
  * This function writes a new header to the mirror log header to the lv
  *
  * Returns: 1 on success, 0 on failure
@@ -469,13 +516,12 @@ static int _remove_mirror_images(struct 
 		for (s = 0; s < mirrored_seg->area_count &&
 			    old_area_count - new_area_count < num_removed; s++) {
 			sub_lv = seg_lv(mirrored_seg, s);
+
 			if (!is_temporary_mirror_layer(sub_lv) &&
 			    _is_mirror_image_removable(sub_lv, removable_pvs)) {
-				/* Swap segment to end */
+				if (shift_mirror_legs(mirrored_seg, s))
+					return 0;
 				new_area_count--;
-				area = mirrored_seg->areas[new_area_count];
-				mirrored_seg->areas[new_area_count] = mirrored_seg->areas[s];
-				mirrored_seg->areas[s] = area;
 			}
 		}
 		if (num_removed && old_area_count == new_area_count)



