qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Zhang, Chen" <chen.zhang@intel.com>
To: "Paolo Bonzini" <pbonzini@redhat.com>,
	"Jason Wang" <jasowang@redhat.com>,
	"Philippe Mathieu-Daudé" <philmd@redhat.com>,
	qemu-dev <qemu-devel@nongnu.org>,
	"Eric Blake" <eblake@redhat.com>,
	"libvir-list@redhat.com" <libvir-list@redhat.com>
Cc: Zhang Chen <zhangckid@gmail.com>
Subject: Re: [PATCH V4 0/5] Introduce Advanced Watch Dog module
Date: Mon, 9 Mar 2020 17:32:11 +0800	[thread overview]
Message-ID: <c16e2cda-a498-4913-2811-6c6ea6b0b0df@intel.com> (raw)
In-Reply-To: <029efd6f-f461-da09-c6b6-1acf67818b64@redhat.com>


On 3/4/2020 9:37 PM, Paolo Bonzini wrote:
> On 04/03/20 09:06, Zhang, Chen wrote:
>>> Hi Eric and Paolo, Can you give some comments about this series?
>>>
>>>
>> No news for a while...
>> We already have some users(Cloud Service Provider) try to use is module in their product.
>> But they also need to follow the Qemu upstream code.
> My main comment about this series is that it's not clear why it is
> needed and how to use it.  The documentation includes a demo, but no
> description of what is an awd_node, a notification_node and an
> opt_script.  I can more or less understand the notification_node and
> opt_script role from the documentation, but not entirely because, for
> example, the two-host demo has hardcoded IP addresses without saying
> which host is which IP address.

Hi Paolo,

Sorry for slow reply and thank you for your comments.

Let me summarize your main opinions and methods:

1. Why AWD is needed.

Advanced Watch Dog is an universal monitoring module on VMM side, it can 
be used to detect network down(VMM to guest, VMM to VMM, VMM to another 
remote server) and do previously set operation. Current AWD patch just 
accept any input as the signal to refresh the watchdog timer, and we can 
also make a certain interactive protocol here. For the outputs, user can 
pre-write some command or some messages in the AWD opt-script. We 
noticed that there is no way for VMM communicate directly, maybe some 
people think we don't need such things(up layer software like openstack 
can handle it). so we engaged with real customer found that they need a 
lightweight and efficient mechanism to solve some practical problems,

For example Edge Computing cases(they think high level software is too 
heavy to use in Edge or it is hard to manage and combine with VM instance).
It make user have basic VM/Host network monitoring tools and basic false 
tolerance and recovery solution.

For COLO FT/HA solution, we already have some CSPs try to use AWD with COLO.

2. Documentation issues, include how to use it.

I will address all your comments and complete details about documentation.

3. Communication protocol issue.

Current AWD without any protocol, any data it gets will be considered a 
heartbeat signal.

I think use QMP format is good for me.

4. Implementation issue.

The AWD script as an optional feature is OK for me.

And report the triggering of the watchdog via QMP events is enough for 
current usage.

But it looks have limitation to notify outside Qemu. I don't know which 
is better choice.

If the QMP events solution is better, I will fix it in next version.


I don't know if I understand your means correctly.

Please give me more guidance on this series.  :-)

Thanks

Zhang Chen


>
> The documentation does not describe the protocol, which is absolutely
> necessary, and does not describe _why_ the protocol was designed like
> that.  Without such documentation it's not clear if, for example, the
> watchdog protocol could be implemented as QMP commands (e.g.
> start-watchdog, stop-watchdog, notify-watchdog).  Another possibility
> could be to use the systemd watchdog protocol, which consists of
> essentially three commands (WATCHDOG=1, WATCHDOG=trigger,
> WATCHDOG_USEC=...) which are transmitted as datagrams.  Documentation is
> important for reviewers to judge the merits of the protocol without (or
> before) diving into the code.
>
> In the demo, the opt_script mechanism is currently using the "human"
> monitor as opposed to QMP.  The human monitor interface is not stable
> and not meant for consumption by management interface.  It is not clear
> if this is just a sample usage, and in practice the notification_node
> would be outside of QEMU, or not.  In general I would prefer to have the
> script as an optional feature, and report the triggering of the watchdog
> via QMP events.
>
> Paolo
>


  reply	other threads:[~2020-03-09  9:33 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-17 12:45 [PATCH V4 0/5] Introduce Advanced Watch Dog module Zhang Chen
2019-12-17 12:45 ` [PATCH V4 1/5] net/awd.c: Introduce Advanced Watch Dog module framework Zhang Chen
2019-12-17 12:45 ` [PATCH V4 2/5] net/awd.c: Initailize input/output chardev Zhang Chen
2019-12-17 12:45 ` [PATCH V4 3/5] net/awd.c: Load advanced watch dog worker thread job Zhang Chen
2019-12-17 12:45 ` [PATCH V4 4/5] vl.c: Make Advanced Watch Dog delayed initialization Zhang Chen
2019-12-17 12:45 ` [PATCH V4 5/5] docs/awd.txt: Add doc to introduce Advanced WatchDog(AWD) module Zhang Chen
2020-01-07  4:32 ` [PATCH V4 0/5] Introduce Advanced Watch Dog module Zhang, Chen
2020-01-19  9:10   ` Zhang, Chen
2020-01-20  2:56     ` Jason Wang
2020-02-11  8:58       ` Zhang, Chen
2020-02-12  2:56         ` Jason Wang
2020-02-20  3:36           ` Zhang, Chen
2020-03-04  8:06             ` Zhang, Chen
2020-03-04 13:37               ` Paolo Bonzini
2020-03-09  9:32                 ` Zhang, Chen [this message]
2020-03-12 15:52                   ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c16e2cda-a498-4913-2811-6c6ea6b0b0df@intel.com \
    --to=chen.zhang@intel.com \
    --cc=eblake@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=libvir-list@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=philmd@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=zhangckid@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).