From: Max Reitz <mreitz@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>, John Snow <jsnow@redhat.com>
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: Re: [PATCH for-6.0? 1/3] job: Add job_wait_unpaused() for block-job-complete
Date: Fri, 9 Apr 2021 11:57:38 +0200
Message-ID: <d8c87c27-f116-c0a4-5557-62fcfc8de44f@redhat.com>
In-Reply-To: <YHAiCyl6x0kZutXg@merkur.fritz.box>

On 09.04.21 11:44, Kevin Wolf wrote:
> Am 08.04.2021 um 18:55 hat John Snow geschrieben:
>> On 4/8/21 12:20 PM, Max Reitz wrote:
>>> block-job-complete can only be applied when the job is READY, not when
>>> it is on STANDBY (ready, but paused). Draining a job technically pauses
>>> it (which makes a READY job enter STANDBY), and ending the drained
>>> section does not synchronously resume it, but only schedules the job,
>>> which will then be resumed. So attempting to complete a job immediately
>>> after a drained section may sometimes fail.
>>>
>>> That is bad at least because users cannot really work nicely around
>>> this: A job may be paused and resumed at any time, so waiting for the
>>> job to be in the READY state and then issuing a block-job-complete poses
>>> a TOCTTOU problem. The only way around it would be to issue
>>> block-job-complete until it no longer fails due to the job being in the
>>> STANDBY state, but that would not be nice.
>>>
>>> We can solve the problem by allowing block-job-complete to be invoked on
>>> jobs that are on STANDBY, if that status is the result of a drained
>>> section (not because the user has paused the job), and that section has
>>> ended. That is, if the job is on STANDBY, but scheduled to be resumed.
>>>
>>> Perhaps we could actually just directly allow this, seeing that mirror
>>> is the only user of ready/complete, and that mirror_complete() could
>>> probably work under the given circumstances, but there may be many side
>>> effects to consider.
>>>
>>> It is simpler to add a function job_wait_unpaused() that waits for the
>>> job to be resumed (under said circumstances), and to make
>>> qmp_block_job_complete() use it to delay job_complete() until then.
>>>
>>> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1945635
>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>> ---
>>> include/qemu/job.h | 15 +++++++++++++++
>>> blockdev.c | 3 +++
>>> job.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>>> 3 files changed, 60 insertions(+)
>>>
>>> diff --git a/include/qemu/job.h b/include/qemu/job.h
>>> index efc6fa7544..cf3082b6d7 100644
>>> --- a/include/qemu/job.h
>>> +++ b/include/qemu/job.h
>>> @@ -563,4 +563,19 @@ void job_dismiss(Job **job, Error **errp);
>>> */
>>> int job_finish_sync(Job *job, void (*finish)(Job *, Error **errp), Error **errp);
>>> +/**
>>> + * If the job has been paused because of a drained section, and that
>>> + * section has ended, wait until the job is resumed.
>>> + *
>>> + * Return 0 if the job is not paused, or if it has been successfully
>>> + * resumed.
>>> + * Return an error if the job has been paused in such a way that
>>> + * waiting will not resume it, i.e. if it has been paused by the user,
>>> + * or if it is still drained.
>>> + *
>>> + * Callers must be in the home AioContext and hold the AioContext lock
>>> + * of job->aio_context.
>>> + */
>>> +int job_wait_unpaused(Job *job, Error **errp);
>>> +
>>> #endif
>>> diff --git a/blockdev.c b/blockdev.c
>>> index a57590aae4..c0cc2fa364 100644
>>> --- a/blockdev.c
>>> +++ b/blockdev.c
>>> @@ -3414,6 +3414,9 @@ void qmp_block_job_complete(const char *device, Error **errp)
>>> return;
>>> }
>>> + if (job_wait_unpaused(&job->job, errp) < 0) {
>>> + return;
>>> + }
>>
>> After which point, we assume we've transitioned back to either RUNNING or
>> READY, and
>>
>>> trace_qmp_block_job_complete(job);
>>> job_complete(&job->job, errp);
>>
>> This function checks the usual state table for permission to deliver/perform
>> the verb.
>>
>>> aio_context_release(aio_context);
>>> diff --git a/job.c b/job.c
>>> index 289edee143..1ea30fd294 100644
>>> --- a/job.c
>>> +++ b/job.c
>>> @@ -1023,3 +1023,45 @@ int job_finish_sync(Job *job, void (*finish)(Job *, Error **errp), Error **errp)
>>> job_unref(job);
>>> return ret;
>>> }
>>> +
>>> +int job_wait_unpaused(Job *job, Error **errp)
>>> +{
>>> + /*
>>> + * Only run this function from the main context, because this is
>>> + * what we need, and this way we do not have to think about what
>>> + * happens if the user concurrently pauses the job from the main
>>> + * monitor.
>>> + */
>>> + assert(qemu_get_current_aio_context() == qemu_get_aio_context());
>>> +
>>> + /*
>>> + * Quick path (e.g. so we do not get an error if pause_count > 0
>>> + * but the job is not even paused)
>>> + */
>>> + if (!job->paused) {
>>> + return 0;
>>> + }
>>> +
>>> + /* If the user has paused the job, waiting will not help */
>>> + if (job->user_paused) {
>>> + error_setg(errp, "Job '%s' has been paused by the user", job->id);
>>> + return -EBUSY;
>>> + }
>>> +
>>
>> Or the job has encountered an error if that error policy is set. It is maybe
>> more accurate to say that the job is currently paused/halted (for some
>> reason) and is awaiting the explicit unpause instruction.
>>
>> "Job '%s' has been paused and needs to be explicitly resumed with
>> job-resume", maybe?
>
> Sounds good to me.
>
>> Job '%s' has been paused and needs to be [explicitly] resumed
>> [by the user] [with job-resume]
>>
>> Some combo of those runes.
>>
>>> + /* Similarly, if the job is still drained, waiting will not help either */
>>> + if (job->pause_count > 0) {
>>> + error_setg(errp, "Job '%s' is blocked and cannot be unpaused", job->id);
>>> + return -EBUSY;
>>> + }
>>> +
>>
>> This leaks an internal state detail out to the caller. In which
>> circumstances does this happen? Do we expect it to?
>>
>> As the user: Why is it blocked? Can I unblock it? Do I wait?
>
> I don't even understand why we need to fail here (which is still an
> unexpected case for management tools, so we would not fix the bug fully,
> but just restrict it to a more unlikely condition).
>
> We know that we're running in a QMP command handler here, so it's not
> the caller that drained the job, but some iothread. The iothread can
> still make progress, so why doesn't waiting help?

Yes, probably the best would be to drop this block.
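
A minimal sketch of the check order that would remain with this block
dropped, written as a self-contained toy model rather than actual job.c
code. ToyJob, toy_wait_decision() and the TOY_* values are made up for
illustration; in the TOY_MUST_WAIT case the real function would go on to
the AIO_WAIT_WHILE() below.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for QEMU's Job, reduced to the fields discussed here. */
typedef struct ToyJob {
    bool paused;       /* the job is currently parked at a pause point */
    bool user_paused;  /* paused explicitly (or by an error policy) */
    int pause_count;   /* outstanding pause requests, e.g. from drains */
} ToyJob;

/* Hypothetical verdicts, not part of any QEMU API. */
enum {
    TOY_NO_WAIT = 0,   /* job is not paused, complete right away */
    TOY_MUST_WAIT = 1, /* automatically paused, wait for the resume */
};

/*
 * Decide what to do before waiting, with the pause_count check removed.
 * Returns TOY_NO_WAIT, TOY_MUST_WAIT, or -EBUSY if waiting cannot help.
 */
int toy_wait_decision(const ToyJob *job)
{
    /* Quick path: a pending pause request alone is not an error. */
    if (!job->paused) {
        return TOY_NO_WAIT;
    }

    /* Only an explicit resume can help here, so do not wait. */
    if (job->user_paused) {
        return -EBUSY;
    }

    /*
     * pause_count is deliberately not consulted: whoever drained the
     * job can still make progress and will end the drained section.
     */
    return TOY_MUST_WAIT;
}
```

With this order, a job that is still drained (paused, pause_count > 0,
not user-paused) is waited for instead of being rejected with -EBUSY.
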
>>> + /*
>>> + * This function is specifically for waiting for a job to be
>>> + * resumed after a drained section. Ending the drained section
>>> + * includes a job_enter(), which schedules the job loop to be run,
>>> + * and once it does, job->paused will be cleared. Therefore, we
>>> + * do not need to invoke job_enter() here.
>>> + */
>>> + AIO_WAIT_WHILE(job->aio_context, job->paused);
>
> Do we need an aio_wait_kick() in job_pause_point()?

Oh, yes.
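
To illustrate why the kick matters, here is a toy, single-threaded model
(all names are hypothetical, this is not QEMU code). It assumes that the
waiter only re-evaluates its condition when its blocking poll is woken
up, which is my understanding of how AIO_WAIT_WHILE() behaves when the
job runs in an iothread. In real QEMU, unrelated main-loop events could
also wake the waiter, so a missing kick would delay the wait rather than
necessarily hang it forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the waiter's event loop: it sleeps until woken up. */
typedef struct ToyLoop {
    int pending_wakeups;
} ToyLoop;

/* Toy model of the job: only the flag the waiter is watching. */
typedef struct ToyJob {
    bool paused;
} ToyJob;

/* Stand-in for a kick: make the waiter's blocking poll return. */
void toy_kick(ToyLoop *loop)
{
    loop->pending_wakeups++;
}

/* The job leaves its pause point, optionally kicking the waiter. */
void toy_job_resume(ToyJob *job, ToyLoop *waiter_loop, bool kick)
{
    job->paused = false;
    if (kick) {
        toy_kick(waiter_loop);
    }
}

/*
 * Replay the interesting interleaving and report whether the waiter
 * gets to see paused == false.
 */
bool toy_waiter_notices_resume(bool job_kicks)
{
    ToyLoop loop = { .pending_wakeups = 0 };
    ToyJob job = { .paused = true };

    /* 1. The waiter saw paused == true and now blocks in its poll. */

    /* 2. Meanwhile, the job is resumed in its own thread. */
    toy_job_resume(&job, &loop, job_kicks);

    /* 3. The poll only returns if somebody delivered a wakeup. */
    if (loop.pending_wakeups == 0) {
        return false; /* still asleep, the condition is never re-checked */
    }
    loop.pending_wakeups--;

    /* 4. Woken up: re-evaluate the wait condition. */
    return !job.paused;
}
```

In this model, toy_waiter_notices_resume(true) succeeds while
toy_waiter_notices_resume(false) leaves the waiter asleep even though
paused has already been cleared.
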
>>> + return 0;
>>> +}
>>>
>>
>> Looks about right to me, but you'll want Kevin's look-see for the finer
>> details, of course.
>>
>> My concern is that this adds a wait of an indefinite period to the
>> job_complete command. We mitigate this by checking for some other internal
>> state criteria first, and then by process of elimination deduce that it's
>> safe to wait, as it will (likely) be very quick.
>>
>> Do we open the door for ourselves to get into trouble here, either by a
>> state we are forgetting to rule out (You'd have added it if you knew the
>> answer to this) or a hypothetical future change where we forget to update
>> this function?
>>
>> Not necessarily a blocker, I think, and this does solve a real problem
>> fairly inexpensively.
>
> I think the only assumption this function really makes (in particular
> after dropping the job->pause_count check) is that an automatically
> paused job (not counting I/O errors which are treated as user paused)
> will be resumed automatically eventually. Which is an entirely
> reasonable assumption because otherwise the job would be stuck with no
> way to resume it.
>
> The part how long it will block is less clear, but if it isn't very
> short, wouldn't that be a sign that a backend is hanging and we're in
> trouble anyway?

Just as a PS, in a reply to one of Vladimir’s mails
(da048f58-43a6-6811-6ad2-0d7899737a23@redhat.com) I was wondering
whether it even makes sense for mirror to do everything it currently
does in mirror_complete() there. Shouldn’t all of those things really
be done in job-finalize (i.e. mirror_exit_common())?

Max