qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Daniel P. Berrange" <berrange@redhat.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>,
	Juan Quintela <quintela@redhat.com>,
	qemu-devel@nongnu.org,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v3 09/27] migration: add reporting of errors for outgoing migration
Date: Fri, 4 Mar 2016 10:49:46 +0000	[thread overview]
Message-ID: <20160304104946.GD22691@redhat.com> (raw)
In-Reply-To: <8760x21sk0.fsf@blackfin.pond.sub.org>

On Fri, Mar 04, 2016 at 10:49:35AM +0100, Markus Armbruster wrote:
> "Daniel P. Berrange" <berrange@redhat.com> writes:
> 
> > Currently if an app initiates an outgoing migration, it
> 
> application
> 
> > may or may not, get an error reported back on failure. If
> > the error occurs synchronously to the 'migrate' command
> > execution, the client app will see the error message. This
> > is the case for DNS lookup failures. If the error occurs
> > asynchronously to the monitor command though, the error
> > will be thrown away and the client left guessing about
> > what went wrong. This is the case for failure to connect
> > to the TCP server (eg due to wrong port, or firewall
> > rules, or other similar errors).
> >
> > In the future we'll be adding more scope for errors to
> > happen asynchronously with the TLS protocol handshake.
> > TLS errors are hard to diagnose even when they are well
> > reported, so discarding errors entirely will make it
> > impossible to debug TLS connection problems.
> >
> > Management apps which do migration are already using
> > 'query-migrate' / 'info migrate' to check up on progress
> > of background migration operations and to see their end
> > status. This is a fine place to also include the error
> > message when things go wrong.
> >
> > This patch thus adds an 'error-desc' field to the
> > MigrationInfo struct, which will be populated when
> > the 'status' is set to 'failed':
> >
> > (qemu) migrate -d tcp:localhost:9001
> > (qemu) info migrate
> > capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off x-postcopy-ram: off
> > Migration status: failed
> > total time: 0 milliseconds
> > error description: Error connecting to socket: Connection refused
> 
> Perhaps:
> 
>   Migration status: failed (Error connecting to socket: Connection refused)
> 
> Just an idea, use whatever you like better.

Yeah, that is nicer for the HMP.

> > In the HMP, when doing non-detached migration, it is
> > also possible to display this error message directly
> > to the app.
> >
> > (qemu) migrate tcp:localhost:9001
> > Error connecting to socket: Connection refused
> 
> You could include a QMP example if you like.

Sure, will add it.


> > @@ -853,12 +857,14 @@ static void migrate_fd_cleanup(void *opaque)
> >      notifier_list_notify(&migration_state_notifiers, s);
> >  }
> >  
> > -void migrate_fd_error(MigrationState *s)
> > +void migrate_fd_error(MigrationState *s, const Error *error)
> >  {
> > -    trace_migrate_fd_error();
> > +    trace_migrate_fd_error(error ? error_get_pretty(error) : "");
> >      assert(s->to_dst_file == NULL);
> >      migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
> >                        MIGRATION_STATUS_FAILED);
> > +    error_free(s->error);
> > +    s->error = error_copy(error);
> 
> Can migrate_fd_error() be called more than once per migration attempt?

I think so, but I felt it worth being paranoid against mistakes
so I chose to call error_free() just in case.

> > diff --git a/qapi-schema.json b/qapi-schema.json
> > index 7b8f2a1..ff89747 100644
> > --- a/qapi-schema.json
> > +++ b/qapi-schema.json
> > @@ -484,6 +484,8 @@
> >  #       throttled during auto-converge. This is only present when auto-converge
> >  #       has started throttling guest cpus. (Since 2.5)
> >  #
> > +# @error-desc: the error description string, when @status == 'failed' (Since 2.6)
> > +#
> 
> "when @status == 'failed'" is a semantic constraint, not visible in
> query-qmp-schema.  Several existing members have similar constraints.
> 
> Making this a flat union tagged by @status would turn semantic
> constraints into syntax, visible in query-qmp-schema.  Not sure it's
> worth the churn now.
> 
> Note that qmp_query_migrate() may not actually set @error-desc when
> @status is 'failed'.  Bug in either the code or the documentation.

Bug in docs :-)

> Further note that code uses it without also checking status.  You could
> assert the constraint holds.  Your choice.
> 
> Let me take a step back and examine the bigger picture.
> 
> Migration is a long-running task with a non-trivial live cycle (see enum
> MigrationStatus).  Similar tasks exist in the block layer ("block
> jobs"), and perhaps we can have a generic "jobs" abstraction some day.
> 
> Originally, QMP was designed to permit doing long-running tasks as
> asynchronous commands.  We ended up doing them as synchronous commands +
> events + status queries.  Two reasons, one accidental, one fundamental.
> 
> The accidental one is asynchronous commands never quite worked, so
> nobody used them, so nobody bothered to fix them.
> 
> The fundamental one is complex life cycles.  A job that starts, runs
> silently for a while, then either completes or fails can be done nicely
> as asynchronous command.  But when the job's life cycle is more complex,
> a single command reply isn't enough.  Thus commands + events + status
> queries.
> 
> In this model, the reply to a status query in a "failed" state takes the
> role of the asynchronous command's error reply.  It therefore makes
> sense to compare the two.
> 
> The QMP error reply is documented as follows in qmp-spec.txt:
> 
>     The format of an error response is:
> 
>     { "error": { "class": json-string, "desc": json-string }, "id": json-value }
> 
>      Where,
> 
>     - The "class" member contains the error class name (eg. "GenericError")
>     - The "desc" member is a human-readable error message. Clients should
>       not attempt to parse this message.
>     - The "id" member contains the transaction identification associated with
>       the command execution if issued by the Client
> 
> MigrationInfo in a "failed" contains @status and @error-desc, and may
> contain additional members like @total-time (not entirely clear from
> documentation, which means the documentation is too vague).
> 
> MigrationInfo covers @desc, but not @class and @id.
> 
> Use of @class is discouraged nowadays, and omitting it here at least
> until we have a compelling use for it makes sense.
> 
> @id ties the asynchronous reply to the command.  Not necessary as long
> as only one migration task can exist.
> 
> The only thing I'd like you to add to MigrationInfo is the "Clients
> should not attempt to parse this" admonition.  Suggest to describe
> @error-desc as "human-readable" while there.

Yes, makes sense.


Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

  reply	other threads:[~2016-03-04 10:49 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-02-26 15:10 [Qemu-devel] [PATCH v3 00/27] Convert migration to QIOChannel & support TLS Daniel P. Berrange
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 01/27] s390: use FILE instead of QEMUFile for creating text file Daniel P. Berrange
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 02/27] migration: remove use of qemu_bufopen from vmstate tests Daniel P. Berrange
2016-03-03  8:43   ` Amit Shah
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 03/27] migration: ensure qemu_fflush() always writes full data amount Daniel P. Berrange
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 04/27] migration: split migration hooks out of QEMUFileOps Daniel P. Berrange
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 05/27] migration: introduce set_blocking function in QEMUFileOps Daniel P. Berrange
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 06/27] migration: force QEMUFile to blocking mode for outgoing migration Daniel P. Berrange
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 07/27] migration: introduce a new QEMUFile impl based on QIOChannel Daniel P. Berrange
2016-03-10 14:44   ` Dr. David Alan Gilbert
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 08/27] migration: add helpers for creating QEMUFile from a QIOChannel Daniel P. Berrange
2016-03-10 14:52   ` Dr. David Alan Gilbert
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 09/27] migration: add reporting of errors for outgoing migration Daniel P. Berrange
2016-03-04  9:49   ` Markus Armbruster
2016-03-04 10:49     ` Daniel P. Berrange [this message]
2016-03-10 15:13   ` Dr. David Alan Gilbert
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 10/27] migration: convert post-copy to use QIOChannelBuffer Daniel P. Berrange
2016-03-10 15:25   ` Dr. David Alan Gilbert
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 11/27] migration: convert unix socket protocol to use QIOChannel Daniel P. Berrange
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 12/27] migration: rename unix.c to socket.c Daniel P. Berrange
2016-03-10 15:35   ` Dr. David Alan Gilbert
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 13/27] migration: convert tcp socket protocol to use QIOChannel Daniel P. Berrange
2016-03-10 15:38   ` Dr. David Alan Gilbert
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 14/27] migration: convert fd " Daniel P. Berrange
2016-03-10 15:46   ` Dr. David Alan Gilbert
2016-03-10 15:56     ` Daniel P. Berrange
2016-03-10 17:27       ` Dr. David Alan Gilbert
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 15/27] migration: convert exec " Daniel P. Berrange
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 16/27] migration: convert RDMA to use QIOChannel interface Daniel P. Berrange
2016-03-10 17:00   ` Dr. David Alan Gilbert
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 17/27] migration: convert savevm to use QIOChannel for writing to files Daniel P. Berrange
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 18/27] migration: delete QEMUFile buffer implementation Daniel P. Berrange
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 19/27] migration: delete QEMUSizedBuffer struct Daniel P. Berrange
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 20/27] migration: delete QEMUFile sockets implementation Daniel P. Berrange
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 21/27] migration: delete QEMUFile stdio implementation Daniel P. Berrange
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 22/27] migration: move definition of struct QEMUFile back into qemu-file.c Daniel P. Berrange
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 23/27] migration: don't use an array for storing migrate parameters Daniel P. Berrange
2016-03-10 17:25   ` Dr. David Alan Gilbert
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 24/27] migration: define 'tls-creds' and 'tls-hostname' migration parameters Daniel P. Berrange
2016-03-10 17:42   ` Dr. David Alan Gilbert
2016-03-10 17:50     ` Daniel P. Berrange
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 25/27] migration: add support for encrypting data with TLS Daniel P. Berrange
2016-03-10 18:25   ` Dr. David Alan Gilbert
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 26/27] migration: remove support for non-iovec based write handlers Daniel P. Berrange
2016-02-26 15:10 ` [Qemu-devel] [PATCH v3 27/27] migration: remove qemu_get_fd method from QEMUFile Daniel P. Berrange

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160304104946.GD22691@redhat.com \
    --to=berrange@redhat.com \
    --cc=amit.shah@redhat.com \
    --cc=armbru@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).