qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: John Snow <jsnow@redhat.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: kwolf@redhat.com, pkrempa@redhat.com, jtc@redhat.com,
	qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management
Date: Wed, 18 Apr 2018 13:29:45 -0400	[thread overview]
Message-ID: <9a57da6e-b7a3-16e2-c404-b35d29e1b317@redhat.com> (raw)
In-Reply-To: <8736zt2cyt.fsf@dusky.pond.sub.org>



On 04/18/2018 03:25 AM, Markus Armbruster wrote:
> John Snow <jsnow@redhat.com> writes:
> 
>> On 04/17/2018 09:44 AM, Markus Armbruster wrote:
>>> John Snow <jsnow@redhat.com> writes:
>>>
>>>> This series seeks to address two distinct but closely related issues
>>>> concerning the job management API.
>>>>
>>>> (1) For jobs that complete when a monitor is not attached and receiving
>>>>     events or notifications, there's no way to discern the job's final
>>>>     return code. Jobs must remain in the query list until dismissed
>>>>     for reliable management.
>>>>
>>>> (2) Jobs that change the block graph structure at an indeterminate point
>>>>     after the job starts compete with the management layer that relies
>>>>     on that graph structure to issue meaningful commands.
>>>>
>>>>     This structure should change only at the behest of the management
>>>>     API, and not asynchronously at unknown points in time. Before a job
>>>>     issues such changes, it must rely on explicit and synchronous
>>>>     confirmation from the management API.
>>>>
>>>> These changes are implemented by formalizing a State Transition Machine
>>>> for the BlockJob subsystem.
>>>>
>>>> Job States:
>>>>
>>>> UNDEFINED       Default state. Internal state only.
>>>> CREATED         Job has been created
>>>> RUNNING         Job has been started and is running
>>>> PAUSED          Job is not ready and has been paused
>>>> READY           Job is ready and is running
>>>> STANDBY         Job is ready and is paused
>>>>
>>>> WAITING         Job is waiting on peers in transaction
>>>> PENDING         Job is waiting on ACK from QMP
>>>> ABORTING        Job is aborting or has been cancelled
>>>> CONCLUDED       Job has finished and has a retcode available
>>>> NULL            Job is being dismantled. Internal state only.
>>>>
>>>> Job Verbs:
>>>>
>>
>> Backporting your quote up here:
>>
>>> For each job verb and job state: what's the new job state?
>>>
>>
>> That's not always 1:1, though I tried to address it in the commit messages.
> 
> Let me rephrase my question then.  For each job verb and job state: what
> are the possible new job states?  If there's more than one, what's the
> condition for each?
> 

Is my answer below not sufficient? Maybe you're asking "Can you write
this up in a formal document" instead, or did I miss explaining something?

> I appreciate commit messages explaining that, but having complete state
> machine documentation in one place (a comment or in docs/) would be
> nice, wouldn't it?
> 
>>>> CANCEL          Instructs a running job to terminate with error,
>>>>                 (Except when that job is READY, which produces no error.)
>>
>> CANCEL will take a job to either NULL... (this is the early abort
>> pathway, prior to the job being fully realized.)
>>
>> ...or to ABORTING (from CREATED once it has fully realized the job, or
>> from RUNNING, READY, WAITING, or PENDING.)
>>
>>>> PAUSE           Request a job to pause.
>>
>> issued to RUNNING or READY, transitions to PAUSED or STANDBY respectively.
>>
>>>> RESUME          Request a job to resume from a pause.
>>
>> issued to PAUSED or STANDBY, transitions to RUNNING or READY respectively.
>>
>>>> SET-SPEED       Change the speed limiting parameter of a job.
>>
>> No run state change.
>>
>>>> COMPLETE        Ask a READY job to finish and exit.
>>>>
>>
>> Issued to a READY job, transitions to WAITING.
>>
>>>> FINALIZE        Ask a PENDING job to perform its graph finalization.
>>
>> Issued to a PENDING job, transitions to CONCLUDED.
>>
>>>> DISMISS         Finish cleaning up an empty job.
>>>
>>
>> Issued to a CONCLUDED job, transitions to NULL.
>>
>>
>>>> And here's my stab at a diagram:
>>>>
>>>>                  +---------+
>>>>                  |UNDEFINED|
>>>>                  +--+------+
>>>>                     |
>>>>                  +--v----+
>>>>        +---------+CREATED+-----------------+
>>>>        |         +--+----+                 |
>>>>        |            |                      |
>>>>        |         +--+----+     +------+    |
>>>>        +---------+RUNNING<----->PAUSED|    |
>>>>        |         +--+-+--+     +------+    |
>>>>        |            | |                    |
>>>>        |            | +------------------+ |
>>>>        |            |                    | |
>>>>        |         +--v--+       +-------+ | |
>>>>        +---------+READY<------->STANDBY| | |
>>>>        |         +--+--+       +-------+ | |
>>>>        |            |                    | |
>>>>        |         +--v----+               | |
>>>>        +---------+WAITING<---------------+ |
>>>>        |         +--+----+                 |
>>>>        |            |                      |
>>>>        |         +--v----+                 |
>>>>        +---------+PENDING|                 |
>>>>        |         +--+----+                 |
>>>>        |            |                      |
>>>>     +--v-----+   +--v------+               |
>>>>     |ABORTING+--->CONCLUDED|               |
>>>>     +--------+   +--+------+               |
>>>>                     |                      |
>>>>                  +--v-+                    |
>>>>                  |NULL<--------------------+
>>>>                  +----+
>>>
>>> Is this diagram missing a few arrowheads?  E.g. on the edge between
>>> RUNNING and WAITING.
>>>
>>
>> Apparently yes. :\
>>
>> (Secretly fixed up in my reply.)
>>
>>> Might push the limits of ASCII art, but here goes anyway: can we label
>>> the arrows with job verbs?
>>>
>>
>> Can you recommend a tool to help me do that? I've been using asciiflow
>> infinity (http://asciiflow.com) and it's not very good, but I don't have
>> anything better.
> 
> I do my ASCII art in Emacs picture-mode.
> 
>>> Can you briefly explain how this state machine addresses (1) and (2)?
>>>
>>
>> (1) The CONCLUDED state allows jobs to persist in the job query list
>> after they would have disappeared in 2.11-era QEMU. This lets us query
>> for completion codes and to dismiss the job at our own leisure.
> 
> Got it.
> 
>> (2) The PENDING state allows jobs to wait in a nearly-completed state,
>> pending authorization from the QMP client to make graph changes.
>> Otherwise, the job has to asynchronously perform this cleanup and the
>> exact point in time is unknowable to the QMP client. By making a PENDING
>> state and a finalize callback (.prepare), we can make this portion of a
>> job's task synchronous.
> 
> This provides for jobs modifying the graph on job completion.  It
> doesn't provide for jobs modifying the graph while they run.  Fine with
> me; we're not aware of a use for messing with the graph in the middle of
> a job.
> 

I didn't consider this possibility. The concept could in theory be
expanded to arbitrary sync points, but I'm not going to worry about that
until the need arises.

>> "John, you added more than two states..."
>>
>> Yup, this was to help simplify the existing state machine, believe it or
>> not. I modeled all jobs as transactions to eliminate different cleanup
>> routing and added two new interim states;
>>
>> - WAITING
>> - ABORTING
>>
>> to help make assertions about the valid transitions jobs can make. The
>> ABORTING state helps make it clear when a job is allowed to fail (and
>> emit QMP events related to such).
>>
>> The WAITING state is simply advisory to help a client know that a job is
>> "finished" but cannot yet receive further instruction because of peers
>> in a transaction. This helps me to add nice QMP errors for any verbs
>> issued to such jobs. "Sorry pal, this job is waiting and can't hear you
>> right now!"
>>
>> This kept the code cleaner than adding a bunch of very fragile boolean
>> error-checking pathways in dozens of helper functions to help avoid
>> illegal instructions on jobs not prepared to receive those instructions.
>>
>> So these two new states don't help accomplish (1) or (2) strictly, but
>> they do facilitate the code additions that _do_ a lot less ugly.
> 

I really bungled that sentence.

> Thanks!
> 
> Looks like a fine starting point for in-tree state machine documentation
> :)
> 

  reply	other threads:[~2018-04-18 17:30 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 01/21] blockjobs: fix set-speed kick John Snow
2018-03-12 18:13   ` Jeff Cody
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 02/21] blockjobs: model single jobs as transactions John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 03/21] Blockjobs: documentation touchup John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 04/21] blockjobs: add status enum John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 05/21] blockjobs: add state transition table John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 06/21] iotests: add pause_wait John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 07/21] blockjobs: add block_job_verb permission table John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 08/21] blockjobs: add ABORTING state John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 09/21] blockjobs: add CONCLUDED state John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 10/21] blockjobs: add NULL state John Snow
2018-03-12 15:28   ` Kevin Wolf
2018-03-12 15:41     ` John Snow
2018-03-12 16:07       ` Kevin Wolf
2018-03-12 16:23         ` John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 11/21] blockjobs: add block_job_dismiss John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 12/21] blockjobs: ensure abort is called for cancelled jobs John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 13/21] blockjobs: add commit, abort, clean helpers John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 14/21] blockjobs: add block_job_txn_apply function John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 15/21] blockjobs: add prepare callback John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 16/21] blockjobs: add waiting status John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 17/21] blockjobs: add PENDING status and event John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 18/21] blockjobs: add block-job-finalize John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 19/21] blockjobs: Expose manual property John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 20/21] iotests: test manual job dismissal John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 21/21] tests/test-blockjob: test cancellations John Snow
2018-03-12 16:23 ` [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management Kevin Wolf
2018-03-12 17:51 ` no-reply
2018-04-17 13:44 ` Markus Armbruster
2018-04-17 13:47   ` Markus Armbruster
2018-04-17 16:56   ` John Snow
2018-04-18  7:25     ` Markus Armbruster
2018-04-18 17:29       ` John Snow [this message]
2018-04-18 17:42         ` Markus Armbruster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9a57da6e-b7a3-16e2-c404-b35d29e1b317@redhat.com \
    --to=jsnow@redhat.com \
    --cc=armbru@redhat.com \
    --cc=jtc@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=pkrempa@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).