* Re: Snapshot merging [not found] ` <20090831221149.GB15126@redhat.com> @ 2009-09-09 15:47 ` Mikulas Patocka 2009-09-09 15:56 ` Mike Snitzer 0 siblings, 1 reply; 6+ messages in thread From: Mikulas Patocka @ 2009-09-09 15:47 UTC (permalink / raw) To: Mike Snitzer; +Cc: dm-devel Hi I found one bug in the userspace code --- start merging, then from the other console deactivate the origin volume with lvchange -an, then wait next 15 seconds for the next poll interval: the poll process will think that the merging finished, removing the snapshot, leaving partially merged origin and causing data corruption: [slunicko:~]# lvconvert -M vg1/lv2_snap1 Merging of volume lv2_snap1 started. lv2: Merged: 13.1% ( .... deactivate the volume from the other console ....) lv2: Merged: -1.0% Merge into logical volume lv2 finished. Logical volume "snapshot4" successfully removed Maybe this bug was in my code too, I don't remember. Otherwise, the code seems stable, I didn't see anything wrong, I tried to crash it while merging and it resumed without data corruption. Mikulas ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Snapshot merging 2009-09-09 15:47 ` Snapshot merging Mikulas Patocka @ 2009-09-09 15:56 ` Mike Snitzer 2009-09-09 16:12 ` Mike Snitzer 0 siblings, 1 reply; 6+ messages in thread From: Mike Snitzer @ 2009-09-09 15:56 UTC (permalink / raw) To: Mikulas Patocka; +Cc: dm-devel On Wed, Sep 09 2009 at 11:47am -0400, Mikulas Patocka <mpatocka@redhat.com> wrote: > Hi > > I found one bug in the userspace code --- start merging, then from the > other console deactivate the origin volume with lvchange -an, then wait > next 15 seconds for the next poll interval: > > the poll process will think that the merging finished, removing the > snapshot, leaving partially merged origin and causing data corruption: > > [slunicko:~]# lvconvert -M vg1/lv2_snap1 > Merging of volume lv2_snap1 started. > lv2: Merged: 13.1% > ( .... deactivate the volume from the other console ....) > lv2: Merged: -1.0% > Merge into logical volume lv2 finished. > Logical volume "snapshot4" successfully removed > > Maybe this bug was in my code too, I don't remember. Likely, considering I merely ported your LVM2 patches to the latest version. That said, I could've introduced this issue by missing something in the port. Regardless, I'll have a look at preventing the origin LV from being deactivated during a merge. > Otherwise, the code seems stable, I didn't see anything wrong, I tried to > crash it while merging and it resumed without data corruption. Great, thanks for the feedback and testing! Mike ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Snapshot merging 2009-09-09 15:56 ` Mike Snitzer @ 2009-09-09 16:12 ` Mike Snitzer 2009-09-09 17:23 ` Mikulas Patocka 0 siblings, 1 reply; 6+ messages in thread From: Mike Snitzer @ 2009-09-09 16:12 UTC (permalink / raw) To: Mikulas Patocka; +Cc: dm-devel On Wed, Sep 09 2009 at 11:56am -0400, Mike Snitzer <snitzer@redhat.com> wrote: > On Wed, Sep 09 2009 at 11:47am -0400, > Mikulas Patocka <mpatocka@redhat.com> wrote: > > > Hi > > > > I found one bug in the userspace code --- start merging, then from the > > other console deactivate the origin volume with lvchange -an, then wait > > next 15 seconds for the next poll interval: > > > > the poll process will think that the merging finished, removing the > > snapshot, leaving partially merged origin and causing data corruption: > > > > [slunicko:~]# lvconvert -M vg1/lv2_snap1 > > Merging of volume lv2_snap1 started. > > lv2: Merged: 13.1% > > ( .... deactivate the volume from the other console ....) > > lv2: Merged: -1.0% > > Merge into logical volume lv2 finished. > > Logical volume "snapshot4" successfully removed > > > > Maybe this bug was in my code too, I don't remember. > > Likely, considering I merely ported your LVM2 patches to the latest > version. That said, I could've introduced this issue by missing > something in the port. Regardless, I'll have a look at preventing the > origin LV from being deactivated during a merge. Or more correctly: gracefully halt the merge and deactivate merging snapshot along with the origin. Then on activation the merge should just resume (in the background). Mike ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Snapshot merging 2009-09-09 16:12 ` Mike Snitzer @ 2009-09-09 17:23 ` Mikulas Patocka 2009-09-09 22:06 ` Mike Snitzer 0 siblings, 1 reply; 6+ messages in thread From: Mikulas Patocka @ 2009-09-09 17:23 UTC (permalink / raw) To: Mike Snitzer; +Cc: dm-devel On Wed, 9 Sep 2009, Mike Snitzer wrote: > On Wed, Sep 09 2009 at 11:56am -0400, > Mike Snitzer <snitzer@redhat.com> wrote: > > > On Wed, Sep 09 2009 at 11:47am -0400, > > Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > > Hi > > > > > > I found one bug in the userspace code --- start merging, then from the > > > other console deactivate the origin volume with lvchange -an, then wait > > > next 15 seconds for the next poll interval: > > > > > > the poll process will think that the merging finished, removing the > > > snapshot, leaving partially merged origin and causing data corruption: > > > > > > [slunicko:~]# lvconvert -M vg1/lv2_snap1 > > > Merging of volume lv2_snap1 started. > > > lv2: Merged: 13.1% > > > ( .... deactivate the volume from the other console ....) > > > lv2: Merged: -1.0% > > > Merge into logical volume lv2 finished. > > > Logical volume "snapshot4" successfully removed > > > > > > Maybe this bug was in my code too, I don't remember. > > > > Likely, considering I merely ported your LVM2 patches to the latest > > version. That said, I could've introduced this issue by missing > > something in the port. Regardless, I'll have a look at preventing the > > origin LV from being deactivated during a merge. > > Or more correctly: gracefully halt the merge and deactivate merging > snapshot along with the origin. Then on activation the merge should > just resume (in the background). > > Mike I'd patch _poll_merge_progress to return zero (instead of PROGRESS_FINISHED) if querying the lv failed (lv_snapshot_percent returned zero) --- that should terminate the polling process without updating anything. You can try it and test it. Mikulas ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Snapshot merging 2009-09-09 17:23 ` Mikulas Patocka @ 2009-09-09 22:06 ` Mike Snitzer 0 siblings, 0 replies; 6+ messages in thread From: Mike Snitzer @ 2009-09-09 22:06 UTC (permalink / raw) To: Mikulas Patocka; +Cc: device-mapper development, lvm-devel On Wed, Sep 09 2009 at 1:23pm -0400, Mikulas Patocka <mpatocka@redhat.com> wrote: > > > On Wed, 9 Sep 2009, Mike Snitzer wrote: > > > On Wed, Sep 09 2009 at 11:56am -0400, > > Mike Snitzer <snitzer@redhat.com> wrote: > > > > > On Wed, Sep 09 2009 at 11:47am -0400, > > > Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > > > > Hi > > > > > > > > I found one bug in the userspace code --- start merging, then from the > > > > other console deactivate the origin volume with lvchange -an, then wait > > > > next 15 seconds for the next poll interval: > > > > > > > > the poll process will think that the merging finished, removing the > > > > snapshot, leaving partially merged origin and causing data corruption: > > > > > > > > [slunicko:~]# lvconvert -M vg1/lv2_snap1 > > > > Merging of volume lv2_snap1 started. > > > > lv2: Merged: 13.1% > > > > ( .... deactivate the volume from the other console ....) > > > > lv2: Merged: -1.0% > > > > Merge into logical volume lv2 finished. > > > > Logical volume "snapshot4" successfully removed > > > > > > > > Maybe this bug was in my code too, I don't remember. > > > > > > Likely, considering I merely ported your LVM2 patches to the latest > > > version. That said, I could've introduced this issue by missing > > > something in the port. Regardless, I'll have a look at preventing the > > > origin LV from being deactivated during a merge. > > > > Or more correctly: gracefully halt the merge and deactivate merging > > snapshot along with the origin. Then on activation the merge should > > just resume (in the background). > > > > Mike > > I'd patch _poll_merge_progress to return zero (instead of > PROGRESS_FINISHED) if querying the lv failed (lv_snapshot_percent returned > zero) --- that should terminate the polling process without updating > anything. > > You can try it and test it. [cc'ing lvm-devel too] Thanks for the pointer; that was part of the fix. But it wasn't sufficient because lib/activate/dev_manager.c:_percent_run() was not properly returning TARGET_STATUS_ERROR. I've updated the 2 relevant patches with the following inter-diffs; these changes fix things for me. I've uploaded the updated patches as usual, see: http://people.redhat.com/msnitzer/patches/snapshot-merge/lvm2/LVM2-2.02.52-cvs/ updated lvm-report-state-in-return-value-of-target_percent.patch contains: diff --git a/lib/activate/activate.c b/lib/activate/activate.c index aa75e5b..a55bf2f 100644 --- a/lib/activate/activate.c +++ b/lib/activate/activate.c @@ -493,7 +493,8 @@ int lv_info_by_lvid(struct cmd_context *cmd, const char *lvid_s, } /* - * Returns 1 if percent set, else 0 on failure. + * Returns TARGET_STATUS_PROCESSING or TARGET_STATUS_FINISHED if percent set; + * else TARGET_STATUS_ERROR or TARGET_STATUS_INVALIDATED on failure. */ int lv_snapshot_percent(const struct logical_volume *lv, float *percent) { diff --git a/lib/activate/dev_manager.c b/lib/activate/dev_manager.c index 6495b8c..d701c71 100644 --- a/lib/activate/dev_manager.c +++ b/lib/activate/dev_manager.c @@ -333,7 +333,7 @@ static int _percent_run(struct dev_manager *dm, const char *name, struct logical_volume *lv, float *percent, uint32_t *event_nr) { - int r = TARGET_STATUS_FINISHED; + int r = TARGET_STATUS_ERROR; struct dm_task *dmt; struct dm_info info; void *next = NULL; @@ -364,6 +364,7 @@ static int _percent_run(struct dev_manager *dm, const char *name, if (event_nr) *event_nr = info.event_nr; + r = TARGET_STATUS_FINISHED; do { next = dm_get_next_target(dmt, next, &start, &length, &type, ¶ms); @@ -396,6 +397,7 @@ static int _percent_run(struct dev_manager *dm, const char *name, } while (next); if (lv && (segh = dm_list_next(&lv->segments, segh))) { + r = TARGET_STATUS_ERROR; log_error("Number of segments in active LV %s does not " "match metadata", lv->name); goto out; updated lvm-merge-background-poll.patch contains: diff --git a/tools/lvconvert.c b/tools/lvconvert.c index d5afca4..3b46e11 100644 --- a/tools/lvconvert.c +++ b/tools/lvconvert.c @@ -360,15 +360,18 @@ static int _poll_merge_progress(struct cmd_context *cmd, { int r; float percent; - if (!(r = lv_snapshot_percent(lv, &percent))) { - log_error("Failed to query for merging percentage of lv %s.", lv->name); + + r = lv_snapshot_percent(lv, &percent); + if (r == TARGET_STATUS_ERROR) { + log_error("Failed query for merging percentage of lv %s.", lv->name); /* returning zero terminates the polling process */ return 0; } - if (r == TARGET_STATUS_INVALIDATED) { + else if (r == TARGET_STATUS_INVALIDATED) { log_print("%s: Snapshot invalidated, aborting merging", lv->name); return PROGRESS_FINISHED; } + if (parms->progress_display) log_print("%s: %s: %.1f%%", lv->name, parms->progress_title, percent); else -- lvm-devel mailing list lvm-devel@redhat.com https://www.redhat.com/mailman/listinfo/lvm-devel ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: Snapshot merging @ 2009-09-09 22:06 ` Mike Snitzer 0 siblings, 0 replies; 6+ messages in thread From: Mike Snitzer @ 2009-09-09 22:06 UTC (permalink / raw) To: lvm-devel On Wed, Sep 09 2009 at 1:23pm -0400, Mikulas Patocka <mpatocka@redhat.com> wrote: > > > On Wed, 9 Sep 2009, Mike Snitzer wrote: > > > On Wed, Sep 09 2009 at 11:56am -0400, > > Mike Snitzer <snitzer@redhat.com> wrote: > > > > > On Wed, Sep 09 2009 at 11:47am -0400, > > > Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > > > > Hi > > > > > > > > I found one bug in the userspace code --- start merging, then from the > > > > other console deactivate the origin volume with lvchange -an, then wait > > > > next 15 seconds for the next poll interval: > > > > > > > > the poll process will think that the merging finished, removing the > > > > snapshot, leaving partially merged origin and causing data corruption: > > > > > > > > [slunicko:~]# lvconvert -M vg1/lv2_snap1 > > > > Merging of volume lv2_snap1 started. > > > > lv2: Merged: 13.1% > > > > ( .... deactivate the volume from the other console ....) > > > > lv2: Merged: -1.0% > > > > Merge into logical volume lv2 finished. > > > > Logical volume "snapshot4" successfully removed > > > > > > > > Maybe this bug was in my code too, I don't remember. > > > > > > Likely, considering I merely ported your LVM2 patches to the latest > > > version. That said, I could've introduced this issue by missing > > > something in the port. Regardless, I'll have a look at preventing the > > > origin LV from being deactivated during a merge. > > > > Or more correctly: gracefully halt the merge and deactivate merging > > snapshot along with the origin. Then on activation the merge should > > just resume (in the background). > > > > Mike > > I'd patch _poll_merge_progress to return zero (instead of > PROGRESS_FINISHED) if querying the lv failed (lv_snapshot_percent returned > zero) --- that should terminate the polling process without updating > anything. > > You can try it and test it. [cc'ing lvm-devel too] Thanks for the pointer; that was part of the fix. But it wasn't sufficient because lib/activate/dev_manager.c:_percent_run() was not properly returning TARGET_STATUS_ERROR. I've updated the 2 relevant patches with the following inter-diffs; these changes fix things for me. I've uploaded the updated patches as usual, see: http://people.redhat.com/msnitzer/patches/snapshot-merge/lvm2/LVM2-2.02.52-cvs/ updated lvm-report-state-in-return-value-of-target_percent.patch contains: diff --git a/lib/activate/activate.c b/lib/activate/activate.c index aa75e5b..a55bf2f 100644 --- a/lib/activate/activate.c +++ b/lib/activate/activate.c @@ -493,7 +493,8 @@ int lv_info_by_lvid(struct cmd_context *cmd, const char *lvid_s, } /* - * Returns 1 if percent set, else 0 on failure. + * Returns TARGET_STATUS_PROCESSING or TARGET_STATUS_FINISHED if percent set; + * else TARGET_STATUS_ERROR or TARGET_STATUS_INVALIDATED on failure. */ int lv_snapshot_percent(const struct logical_volume *lv, float *percent) { diff --git a/lib/activate/dev_manager.c b/lib/activate/dev_manager.c index 6495b8c..d701c71 100644 --- a/lib/activate/dev_manager.c +++ b/lib/activate/dev_manager.c @@ -333,7 +333,7 @@ static int _percent_run(struct dev_manager *dm, const char *name, struct logical_volume *lv, float *percent, uint32_t *event_nr) { - int r = TARGET_STATUS_FINISHED; + int r = TARGET_STATUS_ERROR; struct dm_task *dmt; struct dm_info info; void *next = NULL; @@ -364,6 +364,7 @@ static int _percent_run(struct dev_manager *dm, const char *name, if (event_nr) *event_nr = info.event_nr; + r = TARGET_STATUS_FINISHED; do { next = dm_get_next_target(dmt, next, &start, &length, &type, ¶ms); @@ -396,6 +397,7 @@ static int _percent_run(struct dev_manager *dm, const char *name, } while (next); if (lv && (segh = dm_list_next(&lv->segments, segh))) { + r = TARGET_STATUS_ERROR; log_error("Number of segments in active LV %s does not " "match metadata", lv->name); goto out; updated lvm-merge-background-poll.patch contains: diff --git a/tools/lvconvert.c b/tools/lvconvert.c index d5afca4..3b46e11 100644 --- a/tools/lvconvert.c +++ b/tools/lvconvert.c @@ -360,15 +360,18 @@ static int _poll_merge_progress(struct cmd_context *cmd, { int r; float percent; - if (!(r = lv_snapshot_percent(lv, &percent))) { - log_error("Failed to query for merging percentage of lv %s.", lv->name); + + r = lv_snapshot_percent(lv, &percent); + if (r == TARGET_STATUS_ERROR) { + log_error("Failed query for merging percentage of lv %s.", lv->name); /* returning zero terminates the polling process */ return 0; } - if (r == TARGET_STATUS_INVALIDATED) { + else if (r == TARGET_STATUS_INVALIDATED) { log_print("%s: Snapshot invalidated, aborting merging", lv->name); return PROGRESS_FINISHED; } + if (parms->progress_display) log_print("%s: %s: %.1f%%", lv->name, parms->progress_title, percent); else ^ permalink raw reply related [flat|nested] 6+ messages in thread
end of thread, other threads:[~2009-09-09 22:06 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <4A97A035.9010604@redhat.com>
[not found] ` <Pine.LNX.4.64.0908311740160.16318@hs20-bc2-1.build.redhat.com>
[not found] ` <20090831221149.GB15126@redhat.com>
2009-09-09 15:47 ` Snapshot merging Mikulas Patocka
2009-09-09 15:56 ` Mike Snitzer
2009-09-09 16:12 ` Mike Snitzer
2009-09-09 17:23 ` Mikulas Patocka
2009-09-09 22:06 ` Mike Snitzer
2009-09-09 22:06 ` Mike Snitzer
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.