From: John Snow <jsnow@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: vsementsov@virtuozzo.com, famz@redhat.com, armbru@redhat.com,
	qemu-block@nongnu.org, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v10 12/14] block: add transactional properties
Date: Thu, 5 Nov 2015 13:52:25 -0500
Message-ID: <563BA569.6010602@redhat.com>
In-Reply-To: <20151105104759.GA3348@stefanha-x1.localdomain>



On 11/05/2015 05:47 AM, Stefan Hajnoczi wrote:
> On Tue, Nov 03, 2015 at 12:27:19PM -0500, John Snow wrote:
>>
>>
>> On 11/03/2015 10:17 AM, Stefan Hajnoczi wrote:
>>> On Fri, Oct 23, 2015 at 07:56:50PM -0400, John Snow wrote:
>>>> @@ -1732,6 +1757,10 @@ static void block_dirty_bitmap_add_prepare(BlkActionState *common,
>>>>      BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
>>>>                                               common, common);
>>>>  
>>>> +    if (action_check_cancel_mode(common, errp) < 0) {
>>>> +        return;
>>>> +    }
>>>> +
>>>>      action = common->action->block_dirty_bitmap_add;
>>>>      /* AIO context taken and released within qmp_block_dirty_bitmap_add */
>>>>      qmp_block_dirty_bitmap_add(action->node, action->name,
>>>> @@ -1767,6 +1796,10 @@ static void block_dirty_bitmap_clear_prepare(BlkActionState *common,
>>>>                                               common, common);
>>>>      BlockDirtyBitmap *action;
>>>>  
>>>> +    if (action_check_cancel_mode(common, errp) < 0) {
>>>> +        return;
>>>> +    }
>>>> +
>>>>      action = common->action->block_dirty_bitmap_clear;
>>>>      state->bitmap = block_dirty_bitmap_lookup(action->node,
>>>>                                                action->name,
>>>
>>> Why do the bitmap add/clear actions not support err-cancel=all?
>>>
>>> I understand why other block jobs don't support it, but it's not clear
>>> why these non-block job actions cannot.
>>>
>>
>> Because they don't have a callback to invoke if the rest of the job fails.
>>
>> I could create a BlockJob for them complete with a callback to invoke,
>> but basically it's just because there's no interface to unwind them, or
>> an interface to join them with the transaction.
>>
>> They're small, synchronous non-job actions. Which makes them weird.
> 
> Funny, we've been looking at the same picture while seeing different
> things:
> https://en.wikipedia.org/wiki/Rabbit%E2%80%93duck_illusion
> 
> I think I understand your idea: the transaction should include both
> immediate actions as well as block jobs.
> 
> My mental model was different: immediate actions commit/abort along with
> the 'transaction' command.  Block jobs are separate and complete/cancel
> together in a group.
> 
> In practice I think the two end up being similar because we won't be
> able to implement immediate action commit/abort together with
> long-running block jobs because the immediate actions rely on
> quiescing/pausing the guest for atomic commit/abort.
> 
> So with your mental model the QMP client has to submit 2 'transaction'
> commands: 1 for the immediate actions, 1 for the block jobs.
> 
> In my mental model the QMP client submits 1 command but the immediate
> actions and block jobs are two separate transaction scopes.  This means
> if the block jobs fail, the client needs to be aware of the immediate
> actions that have committed.  Because of this, it becomes just as much
> client effort as submitting two separate 'transaction' commands in your
> model.
> 
> Can anyone see a practical difference?  I think I'm happy with John's
> model.
> 
> Stefan
> 

We discussed this off-list, but for the sake of the archive:

== How it is now ==

Currently, transactions have two implicit phases: the first is the
synchronous phase. If this phase completes successfully, we consider the
transaction a success. The second phase is the asynchronous phase where
jobs launched by the synchronous phase run to completion.

All synchronous commands must complete for the transaction to "succeed."
There are currently (pre-patch) no guarantees about asynchronous command
completion. As long as all synchronous actions complete, asynchronous
actions are free to succeed or fail individually.
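
To make the two phases concrete, here is a minimal sketch in C. It is a
toy model, not QEMU code: the types and the functions txn_sync_phase()
and txn_async_phase() are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model for illustration only; none of these names are QEMU's. */
typedef enum { ACTION_SYNC, ACTION_ASYNC } ActionKind;

typedef struct {
    ActionKind kind;
    bool prepare_ok; /* does the synchronous part of the action succeed? */
    bool job_ok;     /* ACTION_ASYNC only: does the launched job succeed? */
} Action;

/* Synchronous phase: its result is what the transaction command reports.
 * Every action's synchronous part must succeed. */
static bool txn_sync_phase(const Action *acts, size_t n)
{
    assert(acts != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!acts[i].prepare_ok) {
            return false;
        }
    }
    return true;
}

/* Asynchronous phase: each job finishes on its own, with no effect on the
 * others. Returns how many jobs succeeded. */
static size_t txn_async_phase(const Action *acts, size_t n)
{
    size_t succeeded = 0;

    assert(acts != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (acts[i].kind == ACTION_ASYNC && acts[i].job_ok) {
            succeeded++;
        }
    }
    return succeeded;
}
```

With one synchronous action and two jobs, one of which fails later,
txn_sync_phase() still returns true: the transaction "succeeded" even
though only one of its jobs did.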

== My Model ==

The current behavior is my "err-cancel = none" scenario: we offer no
guarantee about the success or failure of the transaction as a whole
after the synchronous portion has completed.

What I was proposing is "err-cancel = all," which to me means that _ALL_
commands in this transaction must succeed (synchronous or not) before
_any_ actions are irrevocably committed. For a hypothetical mixed
synchronous-asynchronous transaction, this means that even after the
transaction has "succeeded" (it passed the synchronous phase), if an
asynchronous action later fails, all actions, synchronous and
asynchronous alike, are rolled back -- a kind of retroactive failure of
the transaction. This is clearly not possible in all cases, so commands
that cannot support these semantics will refuse "err-cancel = all"
during the synchronous phase.

In practice, only asynchronous actions can tolerate these semantics, but
from a user perspective, it's clear that for any transaction
successfully launched with "err-cancel = all", the guarantee applies to
*all* actions, regardless.
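
The refusal can be sketched as a check made at the top of each action's
prepare step, in the spirit of the action_check_cancel_mode() calls in
the quoted hunks. This is a sketch under assumptions, not the patch's
implementation: the enum, the supports_rollback flag, check_cancel_mode()
and txn_prepare() are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; the real property and its values may differ. */
typedef enum { ERR_CANCEL_NONE, ERR_CANCEL_ALL } ErrCancelMode;

typedef struct {
    const char *name;
    bool supports_rollback; /* can this action be undone if a later job fails? */
} ActionType;

/* Returns 0 if the action accepts the mode, -1 if it must refuse. */
static int check_cancel_mode(const ActionType *type, ErrCancelMode mode)
{
    assert(type != NULL);
    if (mode != ERR_CANCEL_NONE && !type->supports_rollback) {
        return -1;
    }
    return 0;
}

/* Synchronous phase: the first action that refuses fails the transaction,
 * so the client learns about it before anything has been committed. */
static int txn_prepare(const ActionType *types, size_t n, ErrCancelMode mode)
{
    assert(types != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (check_cancel_mode(&types[i], mode) < 0) {
            return -1;
        }
    }
    return 0;
}
```

Under this sketch, a transaction that mixes a bitmap action with a backup
job is rejected with ERR_CANCEL_ALL and accepted with ERR_CANCEL_NONE,
while a jobs-only transaction is accepted in both modes.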

== Stefan's Model ==

In Stefan's model, the "err-cancel" parameter applies only to the
*asynchronous* phase, because the synchronous phase has already
reported success back to the user as the return from the qmp-transaction
command. To Stefan, "err-cancel = all" would mean that only the
asynchronous actions have to participate in the "all or none" behavior
of the transaction -- synchronous portions are exempt.

== Equivalence ==

Both models wind up being equivalent:

In Stefan's model, you need no foreknowledge of which actions are
synchronous or not. Upon failure during the asynchronous phase you will
need to understand which actions rolled back and which ones didn't, however.

In my model, you need foreknowledge of which actions are synchronous and
which ones are not, because synchronous actions will refuse the
"err-cancel = all" parameter. There is no sifting through failure states
when the command fails.

It's mostly a matter of when you need to know the difference between the
two classes of actions. In one model, it's before. In the other, it's
after a failure.

My model also allows for an emulation of Stefan's model, using the
hypothetical "err-cancel = jobs-only" mode, which would only enforce the
transaction semantics in the asynchronous phase.
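
The difference in *when* the client has to care can be sketched by
computing per-action outcomes under both readings of an "all or none"
request. This is a toy model with hypothetical names: MODEL_REFUSE_UPFRONT
stands for my model, MODEL_JOBS_ONLY for Stefan's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model for illustration only. */
typedef enum { MODEL_REFUSE_UPFRONT, MODEL_JOBS_ONLY } Model;
typedef enum { OUTCOME_REFUSED, OUTCOME_COMMITTED, OUTCOME_ROLLED_BACK } Outcome;

typedef struct {
    bool is_job; /* false: immediate, synchronous action */
    bool job_ok; /* jobs only: does the job eventually succeed? */
} MixedAction;

/* Fills out[] with each action's final state for an "all or none" request.
 * Returns false if the transaction command itself is refused. */
static bool run_all_or_none(Model model, const MixedAction *acts, size_t n,
                            Outcome *out)
{
    bool any_sync = false;
    bool any_job_failed = false;

    assert((acts != NULL && out != NULL) || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!acts[i].is_job) {
            any_sync = true;
        } else if (!acts[i].job_ok) {
            any_job_failed = true;
        }
    }

    /* Refuse-upfront: synchronous actions cannot honor the request, so
     * nothing is launched and there is no failure state to sift through. */
    if (model == MODEL_REFUSE_UPFRONT && any_sync) {
        for (size_t i = 0; i < n; i++) {
            out[i] = OUTCOME_REFUSED;
        }
        return false;
    }

    /* Otherwise synchronous actions stay committed, and the jobs succeed
     * or roll back together as one group. */
    for (size_t i = 0; i < n; i++) {
        if (!acts[i].is_job) {
            out[i] = OUTCOME_COMMITTED;
        } else {
            out[i] = any_job_failed ? OUTCOME_ROLLED_BACK : OUTCOME_COMMITTED;
        }
    }
    return true;
}
```

For one synchronous action plus two jobs where one job fails, the first
model refuses everything up front, while the second leaves the
synchronous action committed and both jobs rolled back, which the client
then has to untangle.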


For this reason, I think the two approaches to thinking about the
problem wind up having the same effect. I would perhaps argue that my
model is more explicit -- but I'm biased. I wrote it :)

--js
