From: Max Reitz <mreitz@redhat.com>
To: John Snow <jsnow@redhat.com>, qemu-block@nongnu.org
Cc: kwolf@redhat.com, jcody@redhat.com, qemu-devel@nongnu.org,
stefanha@redhat.com
Subject: Re: [Qemu-devel] [PATCH v2 01/13] blockjob: record time of last entrance
Date: Wed, 7 Feb 2018 23:05:40 +0100 [thread overview]
Message-ID: <24fc02c2-cf2a-4a62-8828-80cb14a8ecdc@redhat.com> (raw)
In-Reply-To: <20180119205847.7141-2-jsnow@redhat.com>
[-- Attachment #1: Type: text/plain, Size: 7339 bytes --]
On 2018-01-19 21:58, John Snow wrote:
> The mirror job makes a semi-inaccurate record of the last time we yielded
> by recording the last time we left a "pause", but this doesn't always
> correlate to the time we actually last successfully ceded control.
>
> Record the time we last *exited* a yield centrally. In other words, record
> the time we began execution of this job to know how long we have been
> selfish for.
>
> Signed-off-by: John Snow <jsnow@redhat.com>
> ---
> block/mirror.c | 8 ++------
> blockjob.c | 2 ++
> include/block/blockjob.h | 5 +++++
> 3 files changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/block/mirror.c b/block/mirror.c
> index c9badc1203..88f4e8964d 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -63,7 +63,6 @@ typedef struct MirrorBlockJob {
> QSIMPLEQ_HEAD(, MirrorBuffer) buf_free;
> int buf_free_count;
>
> - uint64_t last_pause_ns;
> unsigned long *in_flight_bitmap;
> int in_flight;
> int64_t bytes_in_flight;
> @@ -596,8 +595,7 @@ static void mirror_throttle(MirrorBlockJob *s)
> {
> int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>
> - if (now - s->last_pause_ns > SLICE_TIME) {
> - s->last_pause_ns = now;
> + if (now - s->common.last_enter_ns > SLICE_TIME) {
> block_job_sleep_ns(&s->common, 0);
> } else {
> block_job_pause_point(&s->common);
> @@ -769,7 +767,6 @@ static void coroutine_fn mirror_run(void *opaque)
>
> mirror_free_init(s);
>
> - s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
> if (!s->is_none_mode) {
> ret = mirror_dirty_init(s);
> if (ret < 0 || block_job_is_cancelled(&s->common)) {
> @@ -803,7 +800,7 @@ static void coroutine_fn mirror_run(void *opaque)
> * We do so every SLICE_TIME nanoseconds, or when there is an error,
> * or when the source is clean, whichever comes first.
> */
> - delta = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - s->last_pause_ns;
> + delta = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - s->common.last_enter_ns;
The horror.
> if (delta < SLICE_TIME &&
> s->common.iostatus == BLOCK_DEVICE_IO_STATUS_OK) {
> if (s->in_flight >= MAX_IN_FLIGHT || s->buf_free_count == 0 ||
> @@ -878,7 +875,6 @@ static void coroutine_fn mirror_run(void *opaque)
> delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
> block_job_sleep_ns(&s->common, delay_ns);
> }
> - s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
> }
Hmmm. So you're now relying on block_job_sleep_ns() updating
last_enter_ns. But it will only do so if we actually do sleep, which we
will not if the block job has been cancelled.
Then, last_enter_ns will stay at its current value indefinitely because
neither block_job_sleep_ns() nor block_job_pause_point() will yield.
OK, so at this point we should leave the mirror loop. There are three
points where this is done:
(1) If s->ret < 0. OK, let's hope it isn't.
(2) If cnt == 0 && should_complete. We'll come back to that.
(3) If !s->synced && block_job_is_cancelled(...). So basically now, if
the job has not emitted a READY event yet.
But if we have emitted the READY, we have to wait for cnt == 0 &&
should_complete (note that should_complete in turn will only be set if
s->in_flight == 0 && cnt == 0). But unless delta < SLICE_TIME, we will
never do another mirror_iteration(), so unless we have already started
the necessary requests, they will never be started and we will loop forever.
So try this on master (I prefixed the QMP lines with in/out depending on
the direction -- note that it's important to have the block-job-cancel
on the same line the human-monitor-command finishes on so they are
executed right after each other!):
$ ./qemu-img create -f qcow2 foo.qcow2 64M
$ x86_64-softmmu/qemu-system-x86_64 -qmp stdio
In: {"QMP": {"version": {"qemu": {"micro": 50, "minor": 11, "major": 2},
"package": " (v2.9.0-632-g4a52d43-dirty)"}, "capabilities": []}}
Out: {"execute":"qmp_capabilities"}
In: {"return": {}}
Out: {"execute":"object-add","arguments":
{"qom-type":"throttle-group","id":"tg0","props":{"limits":
{"bps-total":16777216}}}}
In: {"return": {}}
Out: {"execute":"blockdev-add","arguments":
{"node-name":"source","driver":"qcow2","file":
{"driver":"file","filename":"foo.qcow2"}}}
In: {"return": {}}
Out: {"execute":"blockdev-add","arguments":
{"node-name":"target","driver":"throttle",
"throttle-group":"tg0","file":
{"driver":"null-co","size":67108864}}}
In: {"return": {}}
Out: {"execute":"blockdev-mirror","arguments":
{"job-id":"mirror","device":"source","target":"target",
"sync":"full"}}
In: {"return": {}}
In: {"timestamp": {"seconds": 1518040566, "microseconds": 658111},
"event": "BLOCK_JOB_READY", "data":
{"device": "mirror", "len": 67108864, "offset": 67108864,
"speed": 0, "type": "mirror"}}
Out: {"execute":"human-monitor-command","arguments":
{"command-line":"qemu-io source \"write 0 64M\""}
}{"execute":"block-job-cancel","arguments":{"device":"mirror"}}
In: wrote 67108864/67108864 bytes at offset 0
In: 64 MiB, 1 ops; 0.0310 sec (2.011 GiB/sec and 32.1823 ops/sec)
In: {"return": ""}
In: {"return": {}}
In: {"timestamp": {"seconds": 1518040578, "microseconds": 626669},
"event": "BLOCK_JOB_COMPLETED", "data":
{"device": "mirror", "len": 134217728, "offset": 134217728,
"speed": 0, "type": "mirror"}}
But with your patch, the BLOCK_JOB_COMPLETED never appears (it should
take four seconds).
>
> immediate_exit:
> diff --git a/blockjob.c b/blockjob.c
> index f5cea84e73..2a9ff66b95 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -321,6 +321,7 @@ void block_job_start(BlockJob *job)
> job->pause_count--;
> job->busy = true;
> job->paused = false;
> + job->last_enter_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
> bdrv_coroutine_enter(blk_bs(job->blk), job->co);
> }
>
> @@ -786,6 +787,7 @@ static void block_job_do_yield(BlockJob *job, uint64_t ns)
> job->busy = false;
> block_job_unlock();
> qemu_coroutine_yield();
> + job->last_enter_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>
> /* Set by block_job_enter before re-entering the coroutine. */
> assert(job->busy);
> diff --git a/include/block/blockjob.h b/include/block/blockjob.h
> index 00403d9482..e965845c94 100644
> --- a/include/block/blockjob.h
> +++ b/include/block/blockjob.h
> @@ -141,6 +141,11 @@ typedef struct BlockJob {
> */
> QEMUTimer sleep_timer;
>
> + /**
> + * Timestamp of the last yield
A yield has two timestamps, one pre-yield, one post-yield. Maybe it
should be noted that this records the post-yield one because that's not
what I'd assume just based on the comment, but OTOH the name of the
variable should be enough to clear that up.
Max
> + */
> + uint64_t last_enter_ns;
> +
> /** Non-NULL if this job is part of a transaction */
> BlockJobTxn *txn;
> QLIST_ENTRY(BlockJob) txn_list;
>
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 512 bytes --]
next prev parent reply other threads:[~2018-02-07 22:06 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-01-19 20:58 [Qemu-devel] [PATCH v2 00/13] blockjob: refactor mirror_throttle John Snow
2018-01-19 20:58 ` [Qemu-devel] [PATCH v2 01/13] blockjob: record time of last entrance John Snow
2018-02-07 22:05 ` Max Reitz [this message]
2018-01-19 20:58 ` [Qemu-devel] [PATCH v2 02/13] blockjob: consolidate SLICE_TIME definition John Snow
2018-01-19 20:58 ` [Qemu-devel] [PATCH v2 03/13] blockjob: create block_job_relax John Snow
2018-02-07 22:12 ` Max Reitz
2018-01-19 20:58 ` [Qemu-devel] [PATCH v2 04/13] blockjob: allow block_job_throttle to take delay_ns John Snow
2018-02-07 22:17 ` Max Reitz
2018-01-19 20:58 ` [Qemu-devel] [PATCH v2 05/13] block/commit: use block_job_relax John Snow
2018-01-19 20:58 ` [Qemu-devel] [PATCH v2 06/13] block/stream: " John Snow
2018-01-19 20:58 ` [Qemu-devel] [PATCH v2 07/13] block/backup: " John Snow
2018-02-07 22:19 ` Max Reitz
2018-01-19 20:58 ` [Qemu-devel] [PATCH v2 08/13] allow block_job_relax to return -ECANCELED John Snow
2018-02-07 22:27 ` Max Reitz
2018-01-19 20:58 ` [Qemu-devel] [PATCH v2 09/13] block/backup: remove yield_and_check John Snow
2018-02-07 22:33 ` Max Reitz
2018-01-19 20:58 ` [Qemu-devel] [PATCH v2 10/13] block/mirror: condense cancellation and relax calls John Snow
2018-02-07 22:36 ` Max Reitz
2018-01-19 20:58 ` [Qemu-devel] [PATCH v2 11/13] block/mirror: remove block_job_sleep_ns calls John Snow
2018-02-07 22:46 ` Max Reitz
2018-02-07 22:56 ` Max Reitz
2018-02-09 23:13 ` John Snow
2018-01-19 20:58 ` [Qemu-devel] [PATCH v2 12/13] blockjob: privatize block_job_sleep_ns John Snow
2018-02-07 22:47 ` Max Reitz
2018-01-19 20:58 ` [Qemu-devel] [PATCH v2 13/13] blockjob: remove block_job_pause_point from interface John Snow
2018-02-07 22:50 ` Max Reitz
2018-01-19 21:14 ` [Qemu-devel] [PATCH v2 00/13] blockjob: refactor mirror_throttle no-reply
2018-01-30 20:25 ` John Snow
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=24fc02c2-cf2a-4a62-8828-80cb14a8ecdc@redhat.com \
--to=mreitz@redhat.com \
--cc=jcody@redhat.com \
--cc=jsnow@redhat.com \
--cc=kwolf@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).