All of lore.kernel.org
 help / color / mirror / Atom feed
From: Cornelia Huck <cohuck@redhat.com>
To: Alex Williamson <alex.williamson@redhat.com>,
	Jason Gunthorpe <jgg@nvidia.com>
Cc: corbet@lwn.net, kvm@vger.kernel.org, linux-doc@vger.kernel.org,
	farman@linux.ibm.com, mjrosato@linux.ibm.com,
	pasic@linux.ibm.com
Subject: Re: [RFC PATCH] vfio: Update/Clarify migration uAPI, add NDMA state
Date: Tue, 14 Dec 2021 13:08:31 +0100	[thread overview]
Message-ID: <87ee6fs81s.fsf@redhat.com> (raw)
In-Reply-To: <20211213134038.39bb0618.alex.williamson@redhat.com>

On Mon, Dec 13 2021, Alex Williamson <alex.williamson@redhat.com> wrote:

> We do specify that a migration driver has discretion in using the error
> state for failed transitions, so there are options to simplify error
> handling.
>
> If we look at bit flips, we have:
>
> Initial state
> |  Resuming
> |  |  Saving
> |  |  |  Running
> |  |  |  |  Next states with multiple bit flips
>
> a) 0, 0, 0  (d)
> b) 0, 0, 1  (c) (e)
> c) 0, 1, 0  (b) (e)
> d) 0, 1, 1  (a) (e)
> e) 1, 0, 0  (b) (c) (d)
> f) 1, 0, 1 UNSUPPORTED
> g) 1, 1, 0 ERROR
> h) 1, 1, 1 INVALID
>
> We specify that we cannot pass through any invalid states during
> transition, so if I consider each bit to be a discrete operation and
> map all these multi-bit changes to a sequence of single bit flips, the
> only requirements are:
>
>   1) RESUMING must be cleared before setting SAVING or RUNNING
>   2) SAVING or RUNNING must be cleared before setting RESUMING
>
> I think the basis of your priority scheme comes from that.  Ordering
> of the remaining items is more subtle though, for instance
> 0 -> SAVING | RUNNING can be broken down as:
>
>   0 -> SAVING
>   SAVING -> SAVING | RUNNING 
>
>   vs
>
>   0 -> RUNNING
>   RUNNING -> SAVING | RUNNING
>
> I'd give preference to enabling logging before running and I believe
> that holds for transition (e) -> (d) as well.

Agreed.

>
> In the reverse case, SAVING | RUNNING -> 0
>
>   SAVING | RUNNING -> RUNNING
>   RUNNING -> 0
>
>   vs
>
>   SAVING | RUNNING -> SAVING
>   SAVING -> 0
>
> This one is more arbitrary.  I tend to favor clearing RUNNING to stop
> the device first, largely because it creates nice symmetry in the
> resulting algorithm and follows the general principle that you
> discovered as well, converge towards zero by addressing bit clearing
> before setting.

That also makes sense to me.

> So a valid algorithm would include:
>
> int set_device_state(u32 old, u32 new, bool is_unwind)
> {
> 	if (old.RUNNING && !new.RUNNING) {
> 		curr.RUNNING = 0;
> 		if (ERROR) goto unwind;
> 	}
> 	if (old.SAVING && !new.SAVING) {
> 		curr.SAVING = 0;
> 		if (ERROR) goto unwind;
> 	}
> 	if (old.RESUMING && !new.RESUMING) {
> 		curr.RESUMING = 0;
> 		if (ERROR) goto unwind;
> 	}
> 	if (!old.RESUMING && new.RESUMING) {
> 		curr.RESUMING = 1;
> 		if (ERROR) goto unwind;
> 	}
> 	if (!old.SAVING && new.SAVING) {
> 		curr.saving = 1;
> 		if (ERROR) goto unwind;
> 	}
> 	if (!old.RUNNING && new.RUNNING) {
> 		curr.RUNNING = 1;
> 		if (ERROR) goto unwind;
>         }
>
> 	return 0;
>
> unwind:
> 	if (!is_unwind) {
> 		ret = set_device_state(curr, old, true);
> 		if (ret) {
> 			curr.raw = ERROR;
> 			return ret;
> 		}
> 	}
>
> 	return -EIO;
> }
>

I've stared at this and scribbled a bit on paper and I think this would
work.

> And I think that also addresses the claim that we're doomed to untested
> and complicated error code handling, we unwind by simply swapping the
> args to our set state function and enter the ERROR state should that
> recursive call fail.

Nod. As we clear RUNNING as the first thing and would only set RUNNING
again as the last step, any transition to ERROR should be save.

>
> I think it would be reasonable for documentation to present similar
> pseudo code as a recommendation, but ultimately the migration driver
> needs to come up with something that fits all the requirements.

Yes, something like this should go into Documentation/.


  reply	other threads:[~2021-12-14 12:08 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-09 23:34 [RFC PATCH] vfio: Update/Clarify migration uAPI, add NDMA state Alex Williamson
2021-12-10  1:25 ` Jason Gunthorpe
2021-12-13 20:40   ` Alex Williamson
2021-12-14 12:08     ` Cornelia Huck [this message]
2021-12-14 16:26     ` Jason Gunthorpe
2021-12-20 22:26       ` Alex Williamson
2022-01-04 20:28         ` Jason Gunthorpe
2022-01-06 18:17           ` Alex Williamson
2022-01-06 21:20             ` Jason Gunthorpe
2022-01-10  7:55               ` Tian, Kevin
2022-01-10 17:34                 ` Alex Williamson
2022-01-11  2:41                   ` Tian, Kevin
2022-01-10 18:11                 ` Jason Gunthorpe
2022-01-11  3:14                   ` Tian, Kevin
2022-01-11 18:19                     ` Jason Gunthorpe
2022-01-04  3:49       ` Tian, Kevin
2022-01-04 16:09         ` Jason Gunthorpe
2022-01-05  1:59           ` Tian, Kevin
2022-01-05 12:45             ` Jason Gunthorpe
2022-01-06  6:32               ` Tian, Kevin
2022-01-06 15:42                 ` Jason Gunthorpe
2022-01-07  0:00                   ` Tian, Kevin
2022-01-07  0:29                     ` Jason Gunthorpe
2022-01-07  2:01                       ` Tian, Kevin
2022-01-07 17:23                         ` Jason Gunthorpe
2022-01-10  3:14                           ` Tian, Kevin
2022-01-10 17:52                             ` Jason Gunthorpe
2022-01-11  2:57                               ` Tian, Kevin
2022-01-05  3:06           ` Tian, Kevin
2021-12-20 17:38 ` Cornelia Huck
2021-12-20 22:49   ` Alex Williamson
2021-12-21 11:24     ` Cornelia Huck
2022-01-07  8:03 ` Tian, Kevin
2022-01-07 16:36   ` Alex Williamson
2022-01-10  6:01     ` Tian, Kevin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87ee6fs81s.fsf@redhat.com \
    --to=cohuck@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=corbet@lwn.net \
    --cc=farman@linux.ibm.com \
    --cc=jgg@nvidia.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=mjrosato@linux.ibm.com \
    --cc=pasic@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.