From: Hailiang Zhang <zhang.zhanghailiang@huawei.com>
To: John Snow <jsnow@redhat.com>, Markus Armbruster <armbru@redhat.com>
Cc: qemu-block@nongnu.org, lizhijian@cn.fujitsu.com,
quintela@redhat.com, qemu-devel@nongnu.org,
yunhong.jiang@intel.com, eddie.dong@intel.com,
peter.huangpeng@huawei.com,
Michael Roth <mdroth@linux.vnet.ibm.com>,
arei.gonglei@huawei.com, stefanha@redhat.com,
amit.shah@redhat.com, dgilbert@redhat.com,
hongyang.yang@easystack.cn
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Frame v12 25/38] qmp event: Add event notification for COLO error
Date: Wed, 23 Dec 2015 11:14:01 +0800
Message-ID: <567A1179.2040509@huawei.com>
In-Reply-To: <56786BA0.70400@redhat.com>
On 2015/12/22 5:14, John Snow wrote:
>
>
> On 12/19/2015 05:02 AM, Markus Armbruster wrote:
>> Copying qemu-block because this seems related to generalising block jobs
>> to background jobs.
>>
>> zhanghailiang <zhang.zhanghailiang@huawei.com> writes:
>>
>>> If some errors happen during VM's COLO FT stage, it's important to notify the users
>>> of this event. Together with 'colo_lost_heartbeat', users can intervene in COLO's
>>> failover work immediately.
>>> If users don't want to get involved in COLO's failover verdict,
>>> it is still necessary to notify users that we exited COLO mode.
>>>
>>> Cc: Markus Armbruster <armbru@redhat.com>
>>> Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
>>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>>> ---
>>> v11:
>>> - Fix several typos found by Eric
>>>
>>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>> ---
>>> docs/qmp-events.txt | 17 +++++++++++++++++
>>> migration/colo.c | 11 +++++++++++
>>> qapi-schema.json | 16 ++++++++++++++++
>>> qapi/event.json | 17 +++++++++++++++++
>>> 4 files changed, 61 insertions(+)
>>>
>>> diff --git a/docs/qmp-events.txt b/docs/qmp-events.txt
>>> index d2f1ce4..19f68fc 100644
>>> --- a/docs/qmp-events.txt
>>> +++ b/docs/qmp-events.txt
>>> @@ -184,6 +184,23 @@ Example:
>>> Note: The "ready to complete" status is always reset by a BLOCK_JOB_ERROR
>>> event.
>>>
>>> +COLO_EXIT
>>> +---------
>>> +
>>> +Emitted when VM finishes COLO mode due to some errors happening or
>>> +at the request of users.
>>
>> How would the event's recipient distinguish between "due to error" and
>> "at the user's request"?
>>
>>> +
>>> +Data:
>>> +
>>> + - "mode": COLO mode, primary or secondary side (json-string)
>>> + - "reason": the exit reason, internal error or external request. (json-string)
>>> + - "error": error message (json-string, optional)
>>> +
>>> +Example:
>>> +
>>> +{"timestamp": {"seconds": 2032141960, "microseconds": 417172},
>>> + "event": "COLO_EXIT", "data": {"mode": "primary", "reason": "request" } }
>>> +
>>
>> Pardon my ignorance again... Does "VM finishes COLO mode" mean we have
>> some kind of COLO background job, and it just finished for whatever
>> reason?
>>
>> If yes, this COLO job could be an instance of the general background job
>> concept we're trying to grow from the existing block job concept.
>>
>> I'm not asking you to rebase your work onto the background job
>> infrastructure, not least for the simple reason that it doesn't exist,
>> yet. But I think it would be fruitful to compare your COLO job
>> management QMP interface with the one we have for block jobs. Not only
>> may that avoid unnecessary inconsistency, it could also help shape the
>> general background job interface.
>>
>
> Yes. The "background job" concept doesn't exist in a formal way outside
> of the block layer yet, but we're looking to expand it as we re-tool the
> block jobs themselves.
>
> It may be the case that the COLO commands and events need to go in as
> they are now, but later we can bring them back into the generalized job
> infrastructure.
>
Agreed. ;)
>> Quick overview of the block job QMP interface:
>>
>> * Commands to create a job: block-commit, block-stream, drive-mirror,
>> drive-backup.
>>
>> * Get information on jobs: query-block-jobs
>>
>> * Pause a job: block-job-pause
>>
>> * Resume a job: block-job-resume
>>
>> * Cancel a job: block-job-cancel
>>
>> * Block job completion events: BLOCK_JOB_COMPLETED, BLOCK_JOB_CANCELLED
>>
>> * Block job error event: BLOCK_JOB_ERROR
>>
>> * Block job synchronous completion: event BLOCK_JOB_READY and command
>> block-job-complete
>>
>
> The block-agnostic version of these commands would likely be:
>
> query-jobs
> job-pause
> job-resume
> job-cancel
> job-complete
>
> Events: JOB_COMPLETED, JOB_CANCELLED, JOB_ERROR, JOB_READY.
>
>
> It looks like COLO_EXIT would be an instance of JOB_COMPLETED, and if it
> occurred due to an error, we'd also see JOB_ERROR emitted.
>
Yes. If we use this job framework for COLO, COLO_EXIT will behave like that.
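
As a rough illustration of the mapping John describes, here is a minimal
client-side sketch in Python. It is only a sketch under assumptions:
JOB_COMPLETED and JOB_ERROR are the names proposed in this thread and do
not exist in QMP yet, the helper name colo_exit_to_job_events is
hypothetical, and the ordering (error first, then completion) is my guess
based on how block jobs behave today.

```python
import json


def colo_exit_to_job_events(event):
    """Translate a COLO_EXIT event into the generic job events it would
    correspond to. JOB_ERROR / JOB_COMPLETED are proposed names only."""
    if event.get("event") != "COLO_EXIT":
        raise ValueError("not a COLO_EXIT event")

    data = event["data"]
    job_events = []

    # An exit caused by an internal error would additionally raise JOB_ERROR.
    if data["reason"] == "error":
        job_events.append({
            "event": "JOB_ERROR",
            "data": {"mode": data["mode"], "error": data.get("error", "")},
        })

    # Every COLO exit, whatever the reason, ends the (hypothetical) job.
    job_events.append({"event": "JOB_COMPLETED", "data": {"mode": data["mode"]}})
    return job_events


# The example event from docs/qmp-events.txt in this patch.
raw = ('{"timestamp": {"seconds": 2032141960, "microseconds": 417172},'
       ' "event": "COLO_EXIT", "data": {"mode": "primary", "reason": "request"}}')
print([e["event"] for e in colo_exit_to_job_events(json.loads(raw))])
```

With the example event above this yields only the completion event; an
event carrying "reason": "error" would yield the error event followed by
the completion event.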
>>> DEVICE_DELETED
>>> --------------
>>>
>>> diff --git a/migration/colo.c b/migration/colo.c
>>> index d1dd4e1..d06c14f 100644
>>> --- a/migration/colo.c
>>> +++ b/migration/colo.c
>>> @@ -18,6 +18,7 @@
>>> #include "qemu/error-report.h"
>>> #include "qemu/sockets.h"
>>> #include "migration/failover.h"
>>> +#include "qapi-event.h"
>>>
>>> /* colo buffer */
>>> #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
>>> @@ -349,6 +350,11 @@ static void colo_process_checkpoint(MigrationState *s)
>>> out:
>>> if (ret < 0) {
>>> error_report("%s: %s", __func__, strerror(-ret));
>>> + qapi_event_send_colo_exit(COLO_MODE_PRIMARY, COLO_EXIT_REASON_ERROR,
>>> + true, strerror(-ret), NULL);
>>> + } else {
>>> + qapi_event_send_colo_exit(COLO_MODE_PRIMARY, COLO_EXIT_REASON_REQUEST,
>>> + false, NULL, NULL);
>>> }
>>>
>>> qsb_free(buffer);
>>> @@ -516,6 +522,11 @@ out:
>>> if (ret < 0) {
>>> error_report("colo incoming thread will exit, detect error: %s",
>>> strerror(-ret));
>>> + qapi_event_send_colo_exit(COLO_MODE_SECONDARY, COLO_EXIT_REASON_ERROR,
>>> + true, strerror(-ret), NULL);
>>> + } else {
>>> + qapi_event_send_colo_exit(COLO_MODE_SECONDARY, COLO_EXIT_REASON_REQUEST,
>>> + false, NULL, NULL);
>>> }
>>>
>>> if (fb) {
>>> diff --git a/qapi-schema.json b/qapi-schema.json
>>> index feb7d53..f6ecb88 100644
>>> --- a/qapi-schema.json
>>> +++ b/qapi-schema.json
>>> @@ -778,6 +778,22 @@
>>> 'data': [ 'unknown', 'primary', 'secondary'] }
>>>
>>> ##
>>> +# @COLOExitReason
>>> +#
>>> +# The reason for a COLO exit
>>> +#
>>> +# @unknown: unknown reason
>>
>> How can @unknown happen?
>>
>>> +#
>>> +# @request: COLO exit is due to an external request
>>> +#
>>> +# @error: COLO exit is due to an internal error
>>> +#
>>> +# Since: 2.6
>>> +##
>>> +{ 'enum': 'COLOExitReason',
>>> + 'data': [ 'unknown', 'request', 'error'] }
>>> +
>>> +##
>>> # @x-colo-lost-heartbeat
>>> #
>>> # Tell qemu that heartbeat is lost, request it to do takeover procedures.
>>> diff --git a/qapi/event.json b/qapi/event.json
>>> index f0cef01..f63d456 100644
>>> --- a/qapi/event.json
>>> +++ b/qapi/event.json
>>> @@ -255,6 +255,23 @@
>>> 'data': {'status': 'MigrationStatus'}}
>>>
>>> ##
>>> +# @COLO_EXIT
>>> +#
>>> +# Emitted when VM finishes COLO mode due to some errors happening or
>>> +# at the request of users.
>>> +#
>>> +# @mode: which COLO mode the VM was in when it exited.
>>
>> Can we get 'unknown' here?
>>
>>> +#
>>> +# @reason: describes the reason for the COLO exit.
>>
>> Can we get 'unknown' here?
>>
>>> +#
>>> +# @error: #optional, error message. Only present on error happening.
>>> +#
>>> +# Since: 2.6
>>> +##
>>> +{ 'event': 'COLO_EXIT',
>>> + 'data': {'mode': 'COLOMode', 'reason': 'COLOExitReason', '*error': 'str' } }
>>> +
>>> +##
>>> # @ACPI_DEVICE_OST
>>> #
>>> # Emitted when guest executes ACPI _OST method.
>>
>