All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: Snapshot merging
       [not found]   ` <20090831221149.GB15126@redhat.com>
@ 2009-09-09 15:47     ` Mikulas Patocka
  2009-09-09 15:56       ` Mike Snitzer
  0 siblings, 1 reply; 6+ messages in thread
From: Mikulas Patocka @ 2009-09-09 15:47 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: dm-devel

Hi

I found one bug in the userspace code --- start merging, then from the 
other console deactivate the origin volume with lvchange -an, then wait 
next 15 seconds for the next poll interval:

the poll process will think that the merging finished, removing the 
snapshot, leaving partially merged origin and causing data corruption:

[slunicko:~]# lvconvert -M vg1/lv2_snap1
  Merging of volume lv2_snap1 started.
  lv2: Merged: 13.1%
( .... deactivate the volume from the other console ....)
  lv2: Merged: -1.0%
  Merge into logical volume lv2 finished.
  Logical volume "snapshot4" successfully removed

Maybe this bug was in my code too, I don't remember.

Otherwise, the code seems stable, I didn't see anything wrong, I tried to 
crash it while merging and it resumed without data corruption.

Mikulas

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Snapshot merging
  2009-09-09 15:47     ` Snapshot merging Mikulas Patocka
@ 2009-09-09 15:56       ` Mike Snitzer
  2009-09-09 16:12         ` Mike Snitzer
  0 siblings, 1 reply; 6+ messages in thread
From: Mike Snitzer @ 2009-09-09 15:56 UTC (permalink / raw)
  To: Mikulas Patocka; +Cc: dm-devel

On Wed, Sep 09 2009 at 11:47am -0400,
Mikulas Patocka <mpatocka@redhat.com> wrote:

> Hi
> 
> I found one bug in the userspace code --- start merging, then from the 
> other console deactivate the origin volume with lvchange -an, then wait 
> next 15 seconds for the next poll interval:
> 
> the poll process will think that the merging finished, removing the 
> snapshot, leaving partially merged origin and causing data corruption:
> 
> [slunicko:~]# lvconvert -M vg1/lv2_snap1
>   Merging of volume lv2_snap1 started.
>   lv2: Merged: 13.1%
> ( .... deactivate the volume from the other console ....)
>   lv2: Merged: -1.0%
>   Merge into logical volume lv2 finished.
>   Logical volume "snapshot4" successfully removed
> 
> Maybe this bug was in my code too, I don't remember.

Likely, considering I merely ported your LVM2 patches to the latest
version.  That said, I could've introduced this issue by missing
something in the port.  Regardless, I'll have a look at preventing the
origin LV from being deactivated during a merge.

> Otherwise, the code seems stable, I didn't see anything wrong, I tried to 
> crash it while merging and it resumed without data corruption.

Great, thanks for the feedback and testing!

Mike

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Snapshot merging
  2009-09-09 15:56       ` Mike Snitzer
@ 2009-09-09 16:12         ` Mike Snitzer
  2009-09-09 17:23           ` Mikulas Patocka
  0 siblings, 1 reply; 6+ messages in thread
From: Mike Snitzer @ 2009-09-09 16:12 UTC (permalink / raw)
  To: Mikulas Patocka; +Cc: dm-devel

On Wed, Sep 09 2009 at 11:56am -0400,
Mike Snitzer <snitzer@redhat.com> wrote:

> On Wed, Sep 09 2009 at 11:47am -0400,
> Mikulas Patocka <mpatocka@redhat.com> wrote:
> 
> > Hi
> > 
> > I found one bug in the userspace code --- start merging, then from the 
> > other console deactivate the origin volume with lvchange -an, then wait 
> > next 15 seconds for the next poll interval:
> > 
> > the poll process will think that the merging finished, removing the 
> > snapshot, leaving partially merged origin and causing data corruption:
> > 
> > [slunicko:~]# lvconvert -M vg1/lv2_snap1
> >   Merging of volume lv2_snap1 started.
> >   lv2: Merged: 13.1%
> > ( .... deactivate the volume from the other console ....)
> >   lv2: Merged: -1.0%
> >   Merge into logical volume lv2 finished.
> >   Logical volume "snapshot4" successfully removed
> > 
> > Maybe this bug was in my code too, I don't remember.
> 
> Likely, considering I merely ported your LVM2 patches to the latest
> version.  That said, I could've introduced this issue by missing
> something in the port.  Regardless, I'll have a look at preventing the
> origin LV from being deactivated during a merge.

Or more correctly: gracefully halt the merge and deactivate merging
snapshot along with the origin.  Then on activation the merge should
just resume (in the background).

Mike

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Snapshot merging
  2009-09-09 16:12         ` Mike Snitzer
@ 2009-09-09 17:23           ` Mikulas Patocka
  2009-09-09 22:06               ` Mike Snitzer
  0 siblings, 1 reply; 6+ messages in thread
From: Mikulas Patocka @ 2009-09-09 17:23 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: dm-devel



On Wed, 9 Sep 2009, Mike Snitzer wrote:

> On Wed, Sep 09 2009 at 11:56am -0400,
> Mike Snitzer <snitzer@redhat.com> wrote:
> 
> > On Wed, Sep 09 2009 at 11:47am -0400,
> > Mikulas Patocka <mpatocka@redhat.com> wrote:
> > 
> > > Hi
> > > 
> > > I found one bug in the userspace code --- start merging, then from the 
> > > other console deactivate the origin volume with lvchange -an, then wait 
> > > next 15 seconds for the next poll interval:
> > > 
> > > the poll process will think that the merging finished, removing the 
> > > snapshot, leaving partially merged origin and causing data corruption:
> > > 
> > > [slunicko:~]# lvconvert -M vg1/lv2_snap1
> > >   Merging of volume lv2_snap1 started.
> > >   lv2: Merged: 13.1%
> > > ( .... deactivate the volume from the other console ....)
> > >   lv2: Merged: -1.0%
> > >   Merge into logical volume lv2 finished.
> > >   Logical volume "snapshot4" successfully removed
> > > 
> > > Maybe this bug was in my code too, I don't remember.
> > 
> > Likely, considering I merely ported your LVM2 patches to the latest
> > version.  That said, I could've introduced this issue by missing
> > something in the port.  Regardless, I'll have a look at preventing the
> > origin LV from being deactivated during a merge.
> 
> Or more correctly: gracefully halt the merge and deactivate merging
> snapshot along with the origin.  Then on activation the merge should
> just resume (in the background).
> 
> Mike

I'd patch _poll_merge_progress to return zero (instead of 
PROGRESS_FINISHED) if querying the lv failed (lv_snapshot_percent returned 
zero) --- that should terminate the polling process without updating 
anything.

You can try it and test it.

Mikulas

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Snapshot merging
  2009-09-09 17:23           ` Mikulas Patocka
@ 2009-09-09 22:06               ` Mike Snitzer
  0 siblings, 0 replies; 6+ messages in thread
From: Mike Snitzer @ 2009-09-09 22:06 UTC (permalink / raw)
  To: Mikulas Patocka; +Cc: device-mapper development, lvm-devel

On Wed, Sep 09 2009 at  1:23pm -0400,
Mikulas Patocka <mpatocka@redhat.com> wrote:

> 
> 
> On Wed, 9 Sep 2009, Mike Snitzer wrote:
> 
> > On Wed, Sep 09 2009 at 11:56am -0400,
> > Mike Snitzer <snitzer@redhat.com> wrote:
> > 
> > > On Wed, Sep 09 2009 at 11:47am -0400,
> > > Mikulas Patocka <mpatocka@redhat.com> wrote:
> > > 
> > > > Hi
> > > > 
> > > > I found one bug in the userspace code --- start merging, then from the 
> > > > other console deactivate the origin volume with lvchange -an, then wait 
> > > > next 15 seconds for the next poll interval:
> > > > 
> > > > the poll process will think that the merging finished, removing the 
> > > > snapshot, leaving partially merged origin and causing data corruption:
> > > > 
> > > > [slunicko:~]# lvconvert -M vg1/lv2_snap1
> > > >   Merging of volume lv2_snap1 started.
> > > >   lv2: Merged: 13.1%
> > > > ( .... deactivate the volume from the other console ....)
> > > >   lv2: Merged: -1.0%
> > > >   Merge into logical volume lv2 finished.
> > > >   Logical volume "snapshot4" successfully removed
> > > > 
> > > > Maybe this bug was in my code too, I don't remember.
> > > 
> > > Likely, considering I merely ported your LVM2 patches to the latest
> > > version.  That said, I could've introduced this issue by missing
> > > something in the port.  Regardless, I'll have a look at preventing the
> > > origin LV from being deactivated during a merge.
> > 
> > Or more correctly: gracefully halt the merge and deactivate merging
> > snapshot along with the origin.  Then on activation the merge should
> > just resume (in the background).
> > 
> > Mike
> 
> I'd patch _poll_merge_progress to return zero (instead of 
> PROGRESS_FINISHED) if querying the lv failed (lv_snapshot_percent returned 
> zero) --- that should terminate the polling process without updating 
> anything.
> 
> You can try it and test it.

[cc'ing lvm-devel too]

Thanks for the pointer; that was part of the fix.

But it wasn't sufficient because lib/activate/dev_manager.c:_percent_run()
was not properly returning TARGET_STATUS_ERROR.

I've updated the 2 relevant patches with the following inter-diffs;
these changes fix things for me.  I've uploaded the updated patches as
usual, see:
http://people.redhat.com/msnitzer/patches/snapshot-merge/lvm2/LVM2-2.02.52-cvs/

updated lvm-report-state-in-return-value-of-target_percent.patch contains:

diff --git a/lib/activate/activate.c b/lib/activate/activate.c
index aa75e5b..a55bf2f 100644
--- a/lib/activate/activate.c
+++ b/lib/activate/activate.c
@@ -493,7 +493,8 @@ int lv_info_by_lvid(struct cmd_context *cmd, const char *lvid_s,
 }
 
 /*
- * Returns 1 if percent set, else 0 on failure.
+ * Returns TARGET_STATUS_PROCESSING or TARGET_STATUS_FINISHED if percent set;
+ * else TARGET_STATUS_ERROR or TARGET_STATUS_INVALIDATED on failure.
  */
 int lv_snapshot_percent(const struct logical_volume *lv, float *percent)
 {
diff --git a/lib/activate/dev_manager.c b/lib/activate/dev_manager.c
index 6495b8c..d701c71 100644
--- a/lib/activate/dev_manager.c
+++ b/lib/activate/dev_manager.c
@@ -333,7 +333,7 @@ static int _percent_run(struct dev_manager *dm, const char *name,
 			struct logical_volume *lv, float *percent,
 			uint32_t *event_nr)
 {
-	int r = TARGET_STATUS_FINISHED;
+	int r = TARGET_STATUS_ERROR;
 	struct dm_task *dmt;
 	struct dm_info info;
 	void *next = NULL;
@@ -364,6 +364,7 @@ static int _percent_run(struct dev_manager *dm, const char *name,
 	if (event_nr)
 		*event_nr = info.event_nr;
 
+	r = TARGET_STATUS_FINISHED;
 	do {
 		next = dm_get_next_target(dmt, next, &start, &length, &type,
 					  &params);
@@ -396,6 +397,7 @@ static int _percent_run(struct dev_manager *dm, const char *name,
 	} while (next);
 
 	if (lv && (segh = dm_list_next(&lv->segments, segh))) {
+		r = TARGET_STATUS_ERROR;
 		log_error("Number of segments in active LV %s does not "
 			  "match metadata", lv->name);
 		goto out;

updated lvm-merge-background-poll.patch contains:

diff --git a/tools/lvconvert.c b/tools/lvconvert.c
index d5afca4..3b46e11 100644
--- a/tools/lvconvert.c
+++ b/tools/lvconvert.c
@@ -360,15 +360,18 @@ static int _poll_merge_progress(struct cmd_context *cmd,
 {
 	int r;
 	float percent;
-	if (!(r = lv_snapshot_percent(lv, &percent))) {
-		log_error("Failed to query for merging percentage of lv %s.", lv->name);
+
+	r = lv_snapshot_percent(lv, &percent);
+	if (r == TARGET_STATUS_ERROR) {
+		log_error("Failed query for merging percentage of lv %s.", lv->name);
 		/* returning zero terminates the polling process */
 		return 0;
 	}
-	if (r == TARGET_STATUS_INVALIDATED) {
+	else if (r == TARGET_STATUS_INVALIDATED) {
 		log_print("%s: Snapshot invalidated, aborting merging", lv->name);
 		return PROGRESS_FINISHED;
 	}
+
 	if (parms->progress_display)
 		log_print("%s: %s: %.1f%%", lv->name, parms->progress_title, percent);
 	else

--
lvm-devel mailing list
lvm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/lvm-devel

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: Snapshot merging
@ 2009-09-09 22:06               ` Mike Snitzer
  0 siblings, 0 replies; 6+ messages in thread
From: Mike Snitzer @ 2009-09-09 22:06 UTC (permalink / raw)
  To: lvm-devel

On Wed, Sep 09 2009 at  1:23pm -0400,
Mikulas Patocka <mpatocka@redhat.com> wrote:

> 
> 
> On Wed, 9 Sep 2009, Mike Snitzer wrote:
> 
> > On Wed, Sep 09 2009 at 11:56am -0400,
> > Mike Snitzer <snitzer@redhat.com> wrote:
> > 
> > > On Wed, Sep 09 2009 at 11:47am -0400,
> > > Mikulas Patocka <mpatocka@redhat.com> wrote:
> > > 
> > > > Hi
> > > > 
> > > > I found one bug in the userspace code --- start merging, then from the 
> > > > other console deactivate the origin volume with lvchange -an, then wait 
> > > > next 15 seconds for the next poll interval:
> > > > 
> > > > the poll process will think that the merging finished, removing the 
> > > > snapshot, leaving partially merged origin and causing data corruption:
> > > > 
> > > > [slunicko:~]# lvconvert -M vg1/lv2_snap1
> > > >   Merging of volume lv2_snap1 started.
> > > >   lv2: Merged: 13.1%
> > > > ( .... deactivate the volume from the other console ....)
> > > >   lv2: Merged: -1.0%
> > > >   Merge into logical volume lv2 finished.
> > > >   Logical volume "snapshot4" successfully removed
> > > > 
> > > > Maybe this bug was in my code too, I don't remember.
> > > 
> > > Likely, considering I merely ported your LVM2 patches to the latest
> > > version.  That said, I could've introduced this issue by missing
> > > something in the port.  Regardless, I'll have a look at preventing the
> > > origin LV from being deactivated during a merge.
> > 
> > Or more correctly: gracefully halt the merge and deactivate merging
> > snapshot along with the origin.  Then on activation the merge should
> > just resume (in the background).
> > 
> > Mike
> 
> I'd patch _poll_merge_progress to return zero (instead of 
> PROGRESS_FINISHED) if querying the lv failed (lv_snapshot_percent returned 
> zero) --- that should terminate the polling process without updating 
> anything.
> 
> You can try it and test it.

[cc'ing lvm-devel too]

Thanks for the pointer; that was part of the fix.

But it wasn't sufficient because lib/activate/dev_manager.c:_percent_run()
was not properly returning TARGET_STATUS_ERROR.

I've updated the 2 relevant patches with the following inter-diffs;
these changes fix things for me.  I've uploaded the updated patches as
usual, see:
http://people.redhat.com/msnitzer/patches/snapshot-merge/lvm2/LVM2-2.02.52-cvs/

updated lvm-report-state-in-return-value-of-target_percent.patch contains:

diff --git a/lib/activate/activate.c b/lib/activate/activate.c
index aa75e5b..a55bf2f 100644
--- a/lib/activate/activate.c
+++ b/lib/activate/activate.c
@@ -493,7 +493,8 @@ int lv_info_by_lvid(struct cmd_context *cmd, const char *lvid_s,
 }
 
 /*
- * Returns 1 if percent set, else 0 on failure.
+ * Returns TARGET_STATUS_PROCESSING or TARGET_STATUS_FINISHED if percent set;
+ * else TARGET_STATUS_ERROR or TARGET_STATUS_INVALIDATED on failure.
  */
 int lv_snapshot_percent(const struct logical_volume *lv, float *percent)
 {
diff --git a/lib/activate/dev_manager.c b/lib/activate/dev_manager.c
index 6495b8c..d701c71 100644
--- a/lib/activate/dev_manager.c
+++ b/lib/activate/dev_manager.c
@@ -333,7 +333,7 @@ static int _percent_run(struct dev_manager *dm, const char *name,
 			struct logical_volume *lv, float *percent,
 			uint32_t *event_nr)
 {
-	int r = TARGET_STATUS_FINISHED;
+	int r = TARGET_STATUS_ERROR;
 	struct dm_task *dmt;
 	struct dm_info info;
 	void *next = NULL;
@@ -364,6 +364,7 @@ static int _percent_run(struct dev_manager *dm, const char *name,
 	if (event_nr)
 		*event_nr = info.event_nr;
 
+	r = TARGET_STATUS_FINISHED;
 	do {
 		next = dm_get_next_target(dmt, next, &start, &length, &type,
 					  &params);
@@ -396,6 +397,7 @@ static int _percent_run(struct dev_manager *dm, const char *name,
 	} while (next);
 
 	if (lv && (segh = dm_list_next(&lv->segments, segh))) {
+		r = TARGET_STATUS_ERROR;
 		log_error("Number of segments in active LV %s does not "
 			  "match metadata", lv->name);
 		goto out;

updated lvm-merge-background-poll.patch contains:

diff --git a/tools/lvconvert.c b/tools/lvconvert.c
index d5afca4..3b46e11 100644
--- a/tools/lvconvert.c
+++ b/tools/lvconvert.c
@@ -360,15 +360,18 @@ static int _poll_merge_progress(struct cmd_context *cmd,
 {
 	int r;
 	float percent;
-	if (!(r = lv_snapshot_percent(lv, &percent))) {
-		log_error("Failed to query for merging percentage of lv %s.", lv->name);
+
+	r = lv_snapshot_percent(lv, &percent);
+	if (r == TARGET_STATUS_ERROR) {
+		log_error("Failed query for merging percentage of lv %s.", lv->name);
 		/* returning zero terminates the polling process */
 		return 0;
 	}
-	if (r == TARGET_STATUS_INVALIDATED) {
+	else if (r == TARGET_STATUS_INVALIDATED) {
 		log_print("%s: Snapshot invalidated, aborting merging", lv->name);
 		return PROGRESS_FINISHED;
 	}
+
 	if (parms->progress_display)
 		log_print("%s: %s: %.1f%%", lv->name, parms->progress_title, percent);
 	else



^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2009-09-09 22:06 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <4A97A035.9010604@redhat.com>
     [not found] ` <Pine.LNX.4.64.0908311740160.16318@hs20-bc2-1.build.redhat.com>
     [not found]   ` <20090831221149.GB15126@redhat.com>
2009-09-09 15:47     ` Snapshot merging Mikulas Patocka
2009-09-09 15:56       ` Mike Snitzer
2009-09-09 16:12         ` Mike Snitzer
2009-09-09 17:23           ` Mikulas Patocka
2009-09-09 22:06             ` Mike Snitzer
2009-09-09 22:06               ` Mike Snitzer

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.