[PATCH v2 0/3] block-jobs: add final flush

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v2 0/3] block-jobs: add final flush
@ 2024-06-26 14:50 Vladimir Sementsov-Ogievskiy
  2024-06-26 14:50 ` [PATCH v2 1/3] block/commit: implement " Vladimir Sementsov-Ogievskiy
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2024-06-26 14:50 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, hreitz, kwolf, vsementsov, jsnow, den, f.ebner

Actually block job is not completed without the final flush. It's
rather unexpected to have broken target when job was successfully
completed long ago and now we fail to flush or process just
crashed/killed.

Mirror job already has mirror_flush() for this. So, it's OK.

Add similar things for other jobs: backup, stream, commit.

v2: rework stream and commit, also split into 3 commits.

Vladimir Sementsov-Ogievskiy (3):
  block/commit: implement final flush
  block/stream: implement final flush
  block/backup: implement final flush

 block/backup.c             |  2 +-
 block/block-copy.c         |  7 ++++
 block/commit.c             | 41 +++++++++++++++--------
 block/stream.c             | 67 +++++++++++++++++++++++---------------
 include/block/block-copy.h |  1 +
 5 files changed, 77 insertions(+), 41 deletions(-)

-- 
2.34.1



^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 1/3] block/commit: implement final flush
  2024-06-26 14:50 [PATCH v2 0/3] block-jobs: add final flush Vladimir Sementsov-Ogievskiy
@ 2024-06-26 14:50 ` Vladimir Sementsov-Ogievskiy
  2024-07-18 19:22   ` Kevin Wolf
  2024-06-26 14:50 ` [PATCH v2 2/3] block/stream: " Vladimir Sementsov-Ogievskiy
  2024-06-26 14:50 ` [PATCH v2 3/3] block/backup: " Vladimir Sementsov-Ogievskiy
  2 siblings, 1 reply; 8+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2024-06-26 14:50 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, hreitz, kwolf, vsementsov, jsnow, den, f.ebner

Actually block job is not completed without the final flush. It's
rather unexpected to have broken target when job was successfully
completed long ago and now we fail to flush or process just
crashed/killed.

Mirror job already has mirror_flush() for this. So, it's OK.

Do this for commit job too.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
 block/commit.c | 41 +++++++++++++++++++++++++++--------------
 1 file changed, 27 insertions(+), 14 deletions(-)

diff --git a/block/commit.c b/block/commit.c
index 7c3fdcb0ca..81971692a2 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -134,6 +134,7 @@ static int coroutine_fn commit_run(Job *job, Error **errp)
     int64_t n = 0; /* bytes */
     QEMU_AUTO_VFREE void *buf = NULL;
     int64_t len, base_len;
+    bool need_final_flush = true;
 
     len = blk_co_getlength(s->top);
     if (len < 0) {
@@ -155,8 +156,8 @@ static int coroutine_fn commit_run(Job *job, Error **errp)
 
     buf = blk_blockalign(s->top, COMMIT_BUFFER_SIZE);
 
-    for (offset = 0; offset < len; offset += n) {
-        bool copy;
+    for (offset = 0; offset < len || need_final_flush; offset += n) {
+        bool copy = false;
         bool error_in_source = true;
 
         /* Note that even when no rate limit is applied we need to yield
@@ -166,22 +167,34 @@ static int coroutine_fn commit_run(Job *job, Error **errp)
         if (job_is_cancelled(&s->common.job)) {
             break;
         }
-        /* Copy if allocated above the base */
-        ret = blk_co_is_allocated_above(s->top, s->base_overlay, true,
-                                        offset, COMMIT_BUFFER_SIZE, &n);
-        copy = (ret > 0);
-        trace_commit_one_iteration(s, offset, n, ret);
-        if (copy) {
-            assert(n < SIZE_MAX);
 
-            ret = blk_co_pread(s->top, offset, n, buf, 0);
-            if (ret >= 0) {
-                ret = blk_co_pwrite(s->base, offset, n, buf, 0);
-                if (ret < 0) {
-                    error_in_source = false;
+        if (offset < len) {
+            /* Copy if allocated above the base */
+            ret = blk_co_is_allocated_above(s->top, s->base_overlay, true,
+                                            offset, COMMIT_BUFFER_SIZE, &n);
+            copy = (ret > 0);
+            trace_commit_one_iteration(s, offset, n, ret);
+            if (copy) {
+                assert(n < SIZE_MAX);
+
+                ret = blk_co_pread(s->top, offset, n, buf, 0);
+                if (ret >= 0) {
+                    ret = blk_co_pwrite(s->base, offset, n, buf, 0);
+                    if (ret < 0) {
+                        error_in_source = false;
+                    }
                 }
             }
+        } else {
+            assert(need_final_flush);
+            ret = blk_co_flush(s->base);
+            if (ret < 0) {
+                error_in_source = false;
+            } else {
+                need_final_flush = false;
+            }
         }
+
         if (ret < 0) {
             BlockErrorAction action =
                 block_job_error_action(&s->common, s->on_error,
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 2/3] block/stream: implement final flush
  2024-06-26 14:50 [PATCH v2 0/3] block-jobs: add final flush Vladimir Sementsov-Ogievskiy
  2024-06-26 14:50 ` [PATCH v2 1/3] block/commit: implement " Vladimir Sementsov-Ogievskiy
@ 2024-06-26 14:50 ` Vladimir Sementsov-Ogievskiy
  2024-06-26 14:50 ` [PATCH v2 3/3] block/backup: " Vladimir Sementsov-Ogievskiy
  2 siblings, 0 replies; 8+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2024-06-26 14:50 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, hreitz, kwolf, vsementsov, jsnow, den, f.ebner

Actually block job is not completed without the final flush. It's
rather unexpected to have broken target when job was successfully
completed long ago and now we fail to flush or process just
crashed/killed.

Mirror job already has mirror_flush() for this. So, it's OK.

Do this for stream job too.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
 block/stream.c | 67 ++++++++++++++++++++++++++++++--------------------
 1 file changed, 41 insertions(+), 26 deletions(-)

diff --git a/block/stream.c b/block/stream.c
index 7031eef12b..893db258d4 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -160,6 +160,7 @@ static int coroutine_fn stream_run(Job *job, Error **errp)
     int64_t offset = 0;
     int error = 0;
     int64_t n = 0; /* bytes */
+    bool need_final_flush = true;
 
     WITH_GRAPH_RDLOCK_GUARD() {
         unfiltered_bs = bdrv_skip_filters(s->target_bs);
@@ -175,10 +176,13 @@ static int coroutine_fn stream_run(Job *job, Error **errp)
     }
     job_progress_set_remaining(&s->common.job, len);
 
-    for ( ; offset < len; offset += n) {
-        bool copy;
+    for ( ; offset < len || need_final_flush; offset += n) {
+        bool copy = false;
+        bool error_is_read = true;
         int ret;
 
+        n = 0;
+
         /* Note that even when no rate limit is applied we need to yield
          * with no pending I/O here so that bdrv_drain_all() returns.
          */
@@ -187,35 +191,46 @@ static int coroutine_fn stream_run(Job *job, Error **errp)
             break;
         }
 
-        copy = false;
-
-        WITH_GRAPH_RDLOCK_GUARD() {
-            ret = bdrv_co_is_allocated(unfiltered_bs, offset, STREAM_CHUNK, &n);
-            if (ret == 1) {
-                /* Allocated in the top, no need to copy.  */
-            } else if (ret >= 0) {
-                /*
-                 * Copy if allocated in the intermediate images.  Limit to the
-                 * known-unallocated area [offset, offset+n*BDRV_SECTOR_SIZE).
-                 */
-                ret = bdrv_co_is_allocated_above(bdrv_cow_bs(unfiltered_bs),
-                                                 s->base_overlay, true,
-                                                 offset, n, &n);
-                /* Finish early if end of backing file has been reached */
-                if (ret == 0 && n == 0) {
-                    n = len - offset;
+        if (offset < len) {
+            WITH_GRAPH_RDLOCK_GUARD() {
+                ret = bdrv_co_is_allocated(unfiltered_bs, offset, STREAM_CHUNK,
+                                           &n);
+                if (ret == 1) {
+                    /* Allocated in the top, no need to copy.  */
+                } else if (ret >= 0) {
+                    /*
+                     * Copy if allocated in the intermediate images.  Limit to
+                     * the known-unallocated area
+                     * [offset, offset+n*BDRV_SECTOR_SIZE).
+                     */
+                    ret = bdrv_co_is_allocated_above(bdrv_cow_bs(unfiltered_bs),
+                                                     s->base_overlay, true,
+                                                     offset, n, &n);
+                    /* Finish early if end of backing file has been reached */
+                    if (ret == 0 && n == 0) {
+                        n = len - offset;
+                    }
+
+                    copy = (ret > 0);
                 }
-
-                copy = (ret > 0);
             }
-        }
-        trace_stream_one_iteration(s, offset, n, ret);
-        if (copy) {
-            ret = stream_populate(s->blk, offset, n);
+            trace_stream_one_iteration(s, offset, n, ret);
+            if (copy) {
+                ret = stream_populate(s->blk, offset, n);
+            }
+        } else {
+            assert(need_final_flush);
+            ret = blk_co_flush(s->blk);
+            if (ret < 0) {
+                error_is_read = false;
+            } else {
+                need_final_flush = false;
+            }
         }
         if (ret < 0) {
             BlockErrorAction action =
-                block_job_error_action(&s->common, s->on_error, true, -ret);
+                block_job_error_action(&s->common, s->on_error,
+                                       error_is_read, -ret);
             if (action == BLOCK_ERROR_ACTION_STOP) {
                 n = 0;
                 continue;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 3/3] block/backup: implement final flush
  2024-06-26 14:50 [PATCH v2 0/3] block-jobs: add final flush Vladimir Sementsov-Ogievskiy
  2024-06-26 14:50 ` [PATCH v2 1/3] block/commit: implement " Vladimir Sementsov-Ogievskiy
  2024-06-26 14:50 ` [PATCH v2 2/3] block/stream: " Vladimir Sementsov-Ogievskiy
@ 2024-06-26 14:50 ` Vladimir Sementsov-Ogievskiy
  2 siblings, 0 replies; 8+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2024-06-26 14:50 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, hreitz, kwolf, vsementsov, jsnow, den, f.ebner

Actually block job is not completed without the final flush. It's
rather unexpected to have broken target when job was successfully
completed long ago and now we fail to flush or process just
crashed/killed.

Mirror job already has mirror_flush() for this. So, it's OK.

Do this for backup job too.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
 block/backup.c             | 2 +-
 block/block-copy.c         | 7 +++++++
 include/block/block-copy.h | 1 +
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/block/backup.c b/block/backup.c
index 3dd2e229d2..fee78ba5ad 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -156,7 +156,7 @@ static int coroutine_fn backup_loop(BackupBlockJob *job)
         job->bg_bcs_call = s = block_copy_async(job->bcs, 0,
                 QEMU_ALIGN_UP(job->len, job->cluster_size),
                 job->perf.max_workers, job->perf.max_chunk,
-                backup_block_copy_callback, job);
+                true, backup_block_copy_callback, job);
 
         while (!block_copy_call_finished(s) &&
                !job_is_cancelled(&job->common.job))
diff --git a/block/block-copy.c b/block/block-copy.c
index 7e3b378528..842b0383db 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -54,6 +54,7 @@ typedef struct BlockCopyCallState {
     int max_workers;
     int64_t max_chunk;
     bool ignore_ratelimit;
+    bool need_final_flush;
     BlockCopyAsyncCallbackFunc cb;
     void *cb_opaque;
     /* Coroutine where async block-copy is running */
@@ -899,6 +900,10 @@ block_copy_common(BlockCopyCallState *call_state)
          */
     } while (ret > 0 && !qatomic_read(&call_state->cancelled));
 
+    if (ret == 0 && call_state->need_final_flush) {
+        ret = bdrv_co_flush(s->target->bs);
+    }
+
     qatomic_store_release(&call_state->finished, true);
 
     if (call_state->cb) {
@@ -954,6 +959,7 @@ int coroutine_fn block_copy(BlockCopyState *s, int64_t start, int64_t bytes,
 BlockCopyCallState *block_copy_async(BlockCopyState *s,
                                      int64_t offset, int64_t bytes,
                                      int max_workers, int64_t max_chunk,
+                                     bool need_final_flush,
                                      BlockCopyAsyncCallbackFunc cb,
                                      void *cb_opaque)
 {
@@ -965,6 +971,7 @@ BlockCopyCallState *block_copy_async(BlockCopyState *s,
         .bytes = bytes,
         .max_workers = max_workers,
         .max_chunk = max_chunk,
+        .need_final_flush = need_final_flush,
         .cb = cb,
         .cb_opaque = cb_opaque,
 
diff --git a/include/block/block-copy.h b/include/block/block-copy.h
index bdc703bacd..6588ebaf77 100644
--- a/include/block/block-copy.h
+++ b/include/block/block-copy.h
@@ -62,6 +62,7 @@ int coroutine_fn block_copy(BlockCopyState *s, int64_t offset, int64_t bytes,
 BlockCopyCallState *block_copy_async(BlockCopyState *s,
                                      int64_t offset, int64_t bytes,
                                      int max_workers, int64_t max_chunk,
+                                     bool need_final_flush,
                                      BlockCopyAsyncCallbackFunc cb,
                                      void *cb_opaque);
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/3] block/commit: implement final flush
  2024-06-26 14:50 ` [PATCH v2 1/3] block/commit: implement " Vladimir Sementsov-Ogievskiy
@ 2024-07-18 19:22   ` Kevin Wolf
  2024-07-19 10:35     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 8+ messages in thread
From: Kevin Wolf @ 2024-07-18 19:22 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, qemu-devel, hreitz, jsnow, den, f.ebner

Am 26.06.2024 um 16:50 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Actually block job is not completed without the final flush. It's
> rather unexpected to have broken target when job was successfully
> completed long ago and now we fail to flush or process just
> crashed/killed.
> 
> Mirror job already has mirror_flush() for this. So, it's OK.
> 
> Do this for commit job too.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
>  block/commit.c | 41 +++++++++++++++++++++++++++--------------
>  1 file changed, 27 insertions(+), 14 deletions(-)
> 
> diff --git a/block/commit.c b/block/commit.c
> index 7c3fdcb0ca..81971692a2 100644
> --- a/block/commit.c
> +++ b/block/commit.c
> @@ -134,6 +134,7 @@ static int coroutine_fn commit_run(Job *job, Error **errp)
>      int64_t n = 0; /* bytes */
>      QEMU_AUTO_VFREE void *buf = NULL;
>      int64_t len, base_len;
> +    bool need_final_flush = true;
>  
>      len = blk_co_getlength(s->top);
>      if (len < 0) {
> @@ -155,8 +156,8 @@ static int coroutine_fn commit_run(Job *job, Error **errp)
>  
>      buf = blk_blockalign(s->top, COMMIT_BUFFER_SIZE);
>  
> -    for (offset = 0; offset < len; offset += n) {
> -        bool copy;
> +    for (offset = 0; offset < len || need_final_flush; offset += n) {

In general, the control flow would be nicer to read if the final flush
weren't integrated into the loop, but just added after it.

But I assume this is pretty much required for pausing the job during
error handling in the final flush if you don't want to duplicate a lot
of the logic into a second loop?

> +        bool copy = false;
>          bool error_in_source = true;
>  
>          /* Note that even when no rate limit is applied we need to yield
> @@ -166,22 +167,34 @@ static int coroutine_fn commit_run(Job *job, Error **errp)
>          if (job_is_cancelled(&s->common.job)) {
>              break;
>          }
> -        /* Copy if allocated above the base */
> -        ret = blk_co_is_allocated_above(s->top, s->base_overlay, true,
> -                                        offset, COMMIT_BUFFER_SIZE, &n);
> -        copy = (ret > 0);
> -        trace_commit_one_iteration(s, offset, n, ret);
> -        if (copy) {
> -            assert(n < SIZE_MAX);
>  
> -            ret = blk_co_pread(s->top, offset, n, buf, 0);
> -            if (ret >= 0) {
> -                ret = blk_co_pwrite(s->base, offset, n, buf, 0);
> -                if (ret < 0) {
> -                    error_in_source = false;
> +        if (offset < len) {
> +            /* Copy if allocated above the base */
> +            ret = blk_co_is_allocated_above(s->top, s->base_overlay, true,
> +                                            offset, COMMIT_BUFFER_SIZE, &n);
> +            copy = (ret > 0);
> +            trace_commit_one_iteration(s, offset, n, ret);
> +            if (copy) {
> +                assert(n < SIZE_MAX);
> +
> +                ret = blk_co_pread(s->top, offset, n, buf, 0);
> +                if (ret >= 0) {
> +                    ret = blk_co_pwrite(s->base, offset, n, buf, 0);
> +                    if (ret < 0) {
> +                        error_in_source = false;
> +                    }
>                  }
>              }
> +        } else {
> +            assert(need_final_flush);
> +            ret = blk_co_flush(s->base);
> +            if (ret < 0) {
> +                error_in_source = false;
> +            } else {
> +                need_final_flush = false;
> +            }

Should we set n = 0 in this block to avoid counting the last chunk twice
for the progress?

>          }
> +
>          if (ret < 0) {
>              BlockErrorAction action =
>                  block_job_error_action(&s->common, s->on_error,

Kevin



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/3] block/commit: implement final flush
  2024-07-18 19:22   ` Kevin Wolf
@ 2024-07-19 10:35     ` Vladimir Sementsov-Ogievskiy
  2024-07-29 12:25       ` Kevin Wolf
  0 siblings, 1 reply; 8+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2024-07-19 10:35 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel, hreitz, jsnow, den, f.ebner

On 18.07.24 22:22, Kevin Wolf wrote:
> Am 26.06.2024 um 16:50 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> Actually block job is not completed without the final flush. It's
>> rather unexpected to have broken target when job was successfully
>> completed long ago and now we fail to flush or process just
>> crashed/killed.
>>
>> Mirror job already has mirror_flush() for this. So, it's OK.
>>
>> Do this for commit job too.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>> ---
>>   block/commit.c | 41 +++++++++++++++++++++++++++--------------
>>   1 file changed, 27 insertions(+), 14 deletions(-)
>>
>> diff --git a/block/commit.c b/block/commit.c
>> index 7c3fdcb0ca..81971692a2 100644
>> --- a/block/commit.c
>> +++ b/block/commit.c
>> @@ -134,6 +134,7 @@ static int coroutine_fn commit_run(Job *job, Error **errp)
>>       int64_t n = 0; /* bytes */
>>       QEMU_AUTO_VFREE void *buf = NULL;
>>       int64_t len, base_len;
>> +    bool need_final_flush = true;
>>   
>>       len = blk_co_getlength(s->top);
>>       if (len < 0) {
>> @@ -155,8 +156,8 @@ static int coroutine_fn commit_run(Job *job, Error **errp)
>>   
>>       buf = blk_blockalign(s->top, COMMIT_BUFFER_SIZE);
>>   
>> -    for (offset = 0; offset < len; offset += n) {
>> -        bool copy;
>> +    for (offset = 0; offset < len || need_final_flush; offset += n) {
> 
> In general, the control flow would be nicer to read if the final flush
> weren't integrated into the loop, but just added after it.
> 
> But I assume this is pretty much required for pausing the job during
> error handling in the final flush if you don't want to duplicate a lot
> of the logic into a second loop?

Right, that's the reason.

> 
>> +        bool copy = false;
>>           bool error_in_source = true;
>>   
>>           /* Note that even when no rate limit is applied we need to yield
>> @@ -166,22 +167,34 @@ static int coroutine_fn commit_run(Job *job, Error **errp)
>>           if (job_is_cancelled(&s->common.job)) {
>>               break;
>>           }
>> -        /* Copy if allocated above the base */
>> -        ret = blk_co_is_allocated_above(s->top, s->base_overlay, true,
>> -                                        offset, COMMIT_BUFFER_SIZE, &n);
>> -        copy = (ret > 0);
>> -        trace_commit_one_iteration(s, offset, n, ret);
>> -        if (copy) {
>> -            assert(n < SIZE_MAX);
>>   
>> -            ret = blk_co_pread(s->top, offset, n, buf, 0);
>> -            if (ret >= 0) {
>> -                ret = blk_co_pwrite(s->base, offset, n, buf, 0);
>> -                if (ret < 0) {
>> -                    error_in_source = false;
>> +        if (offset < len) {
>> +            /* Copy if allocated above the base */
>> +            ret = blk_co_is_allocated_above(s->top, s->base_overlay, true,
>> +                                            offset, COMMIT_BUFFER_SIZE, &n);
>> +            copy = (ret > 0);
>> +            trace_commit_one_iteration(s, offset, n, ret);
>> +            if (copy) {
>> +                assert(n < SIZE_MAX);
>> +
>> +                ret = blk_co_pread(s->top, offset, n, buf, 0);
>> +                if (ret >= 0) {
>> +                    ret = blk_co_pwrite(s->base, offset, n, buf, 0);
>> +                    if (ret < 0) {
>> +                        error_in_source = false;
>> +                    }
>>                   }
>>               }
>> +        } else {
>> +            assert(need_final_flush);
>> +            ret = blk_co_flush(s->base);
>> +            if (ret < 0) {
>> +                error_in_source = false;
>> +            } else {
>> +                need_final_flush = false;
>> +            }
> 
> Should we set n = 0 in this block to avoid counting the last chunk twice
> for the progress?

Oops, right. Will fix, thanks

> 
>>           }
>> +
>>           if (ret < 0) {
>>               BlockErrorAction action =
>>                   block_job_error_action(&s->common, s->on_error,
> 
> Kevin
> 

-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/3] block/commit: implement final flush
  2024-07-19 10:35     ` Vladimir Sementsov-Ogievskiy
@ 2024-07-29 12:25       ` Kevin Wolf
  2024-08-02 11:12         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 8+ messages in thread
From: Kevin Wolf @ 2024-07-29 12:25 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, qemu-devel, hreitz, jsnow, den, f.ebner

Am 19.07.2024 um 12:35 hat Vladimir Sementsov-Ogievskiy geschrieben:
> On 18.07.24 22:22, Kevin Wolf wrote:
> > Am 26.06.2024 um 16:50 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > Actually block job is not completed without the final flush. It's
> > > rather unexpected to have broken target when job was successfully
> > > completed long ago and now we fail to flush or process just
> > > crashed/killed.
> > > 
> > > Mirror job already has mirror_flush() for this. So, it's OK.
> > > 
> > > Do this for commit job too.
> > > 
> > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> > > ---
> > >   block/commit.c | 41 +++++++++++++++++++++++++++--------------
> > >   1 file changed, 27 insertions(+), 14 deletions(-)
> > > 
> > > diff --git a/block/commit.c b/block/commit.c
> > > index 7c3fdcb0ca..81971692a2 100644
> > > --- a/block/commit.c
> > > +++ b/block/commit.c
> > > @@ -134,6 +134,7 @@ static int coroutine_fn commit_run(Job *job, Error **errp)
> > >       int64_t n = 0; /* bytes */
> > >       QEMU_AUTO_VFREE void *buf = NULL;
> > >       int64_t len, base_len;
> > > +    bool need_final_flush = true;
> > >       len = blk_co_getlength(s->top);
> > >       if (len < 0) {
> > > @@ -155,8 +156,8 @@ static int coroutine_fn commit_run(Job *job, Error **errp)
> > >       buf = blk_blockalign(s->top, COMMIT_BUFFER_SIZE);
> > > -    for (offset = 0; offset < len; offset += n) {
> > > -        bool copy;
> > > +    for (offset = 0; offset < len || need_final_flush; offset += n) {
> > 
> > In general, the control flow would be nicer to read if the final flush
> > weren't integrated into the loop, but just added after it.
> > 
> > But I assume this is pretty much required for pausing the job during
> > error handling in the final flush if you don't want to duplicate a lot
> > of the logic into a second loop?
> 
> Right, that's the reason.

This would probably be the right solution if it affected only commit.
But I've thought a bit more about this and given that the same thing
happens in all of the block jobs, I'm really wondering if introducing a
block job helper function wouldn't be better, so that each block job
could just add something like this after its main loop:

    do {
        ret = blk_co_flush();
    } while (block_job_handle_error(job, ret));

And the helper would call block_job_error_action(), stop the job if
necessary, check if it's cancelled, include a pause point etc.

Kevin



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/3] block/commit: implement final flush
  2024-07-29 12:25       ` Kevin Wolf
@ 2024-08-02 11:12         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 8+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2024-08-02 11:12 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel, hreitz, jsnow, den, f.ebner

On 29.07.24 15:25, Kevin Wolf wrote:
> Am 19.07.2024 um 12:35 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> On 18.07.24 22:22, Kevin Wolf wrote:
>>> Am 26.06.2024 um 16:50 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>> Actually block job is not completed without the final flush. It's
>>>> rather unexpected to have broken target when job was successfully
>>>> completed long ago and now we fail to flush or process just
>>>> crashed/killed.
>>>>
>>>> Mirror job already has mirror_flush() for this. So, it's OK.
>>>>
>>>> Do this for commit job too.
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>>>> ---
>>>>    block/commit.c | 41 +++++++++++++++++++++++++++--------------
>>>>    1 file changed, 27 insertions(+), 14 deletions(-)
>>>>
>>>> diff --git a/block/commit.c b/block/commit.c
>>>> index 7c3fdcb0ca..81971692a2 100644
>>>> --- a/block/commit.c
>>>> +++ b/block/commit.c
>>>> @@ -134,6 +134,7 @@ static int coroutine_fn commit_run(Job *job, Error **errp)
>>>>        int64_t n = 0; /* bytes */
>>>>        QEMU_AUTO_VFREE void *buf = NULL;
>>>>        int64_t len, base_len;
>>>> +    bool need_final_flush = true;
>>>>        len = blk_co_getlength(s->top);
>>>>        if (len < 0) {
>>>> @@ -155,8 +156,8 @@ static int coroutine_fn commit_run(Job *job, Error **errp)
>>>>        buf = blk_blockalign(s->top, COMMIT_BUFFER_SIZE);
>>>> -    for (offset = 0; offset < len; offset += n) {
>>>> -        bool copy;
>>>> +    for (offset = 0; offset < len || need_final_flush; offset += n) {
>>>
>>> In general, the control flow would be nicer to read if the final flush
>>> weren't integrated into the loop, but just added after it.
>>>
>>> But I assume this is pretty much required for pausing the job during
>>> error handling in the final flush if you don't want to duplicate a lot
>>> of the logic into a second loop?
>>
>> Right, that's the reason.
> 
> This would probably be the right solution if it affected only commit.
> But I've thought a bit more about this and given that the same thing
> happens in all of the block jobs, I'm really wondering if introducing a
> block job helper function wouldn't be better, so that each block job
> could just add something like this after its main loop:
> 
>      do {
>          ret = blk_co_flush();
>      } while (block_job_handle_error(job, ret));
> 
> And the helper would call block_job_error_action(), stop the job if
> necessary, check if it's cancelled, include a pause point etc.
> 

Agree. Thanks, I'll try.

-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-08-02 11:13 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-26 14:50 [PATCH v2 0/3] block-jobs: add final flush Vladimir Sementsov-Ogievskiy
2024-06-26 14:50 ` [PATCH v2 1/3] block/commit: implement " Vladimir Sementsov-Ogievskiy
2024-07-18 19:22   ` Kevin Wolf
2024-07-19 10:35     ` Vladimir Sementsov-Ogievskiy
2024-07-29 12:25       ` Kevin Wolf
2024-08-02 11:12         ` Vladimir Sementsov-Ogievskiy
2024-06-26 14:50 ` [PATCH v2 2/3] block/stream: " Vladimir Sementsov-Ogievskiy
2024-06-26 14:50 ` [PATCH v2 3/3] block/backup: " Vladimir Sementsov-Ogievskiy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).