qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Lukas Straub <lukasstraub2@web.de>
To: Markus Armbruster <armbru@redhat.com>
Cc: "Kevin Wolf" <kwolf@redhat.com>,
	"Daniel P. Berrangé" <berrange@redhat.com>,
	qemu-block <qemu-block@nongnu.org>,
	"Juan Quintela" <quintela@redhat.com>,
	qemu-devel <qemu-devel@nongnu.org>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Marc-André Lureau" <marcandre.lureau@redhat.com>,
	"Max Reitz" <mreitz@redhat.com>
Subject: Re: [PATCH v9 1/8] Introduce yank feature
Date: Fri, 30 Oct 2020 16:19:08 +0100	[thread overview]
Message-ID: <20201030161908.2f71ecd5@luklap> (raw)
In-Reply-To: <87lffnstge.fsf@dusky.pond.sub.org>

[-- Attachment #1: Type: text/plain, Size: 7284 bytes --]

On Fri, 30 Oct 2020 15:02:09 +0100
Markus Armbruster <armbru@redhat.com> wrote:

> Lukas Straub <lukasstraub2@web.de> writes:
> 
> > On Thu, 29 Oct 2020 17:36:14 +0100
> > Markus Armbruster <armbru@redhat.com> wrote:
> >  
> >> Nothing major, looks almost ready to me.
> >> 
> >> Lukas Straub <lukasstraub2@web.de> writes:
> >>   
> >> > The yank feature allows to recover from hanging qemu by "yanking"
> >> > at various parts. Other qemu systems can register themselves and
> >> > multiple yank functions. Then all yank functions for selected
> >> > instances can be called by the 'yank' out-of-band qmp command.
> >> > Available instances can be queried by a 'query-yank' oob command.
> >> >
> >> > Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> >> > Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
> >> > ---
> >> >  include/qemu/yank.h |  95 ++++++++++++++++++++
> >> >  qapi/misc.json      | 106 ++++++++++++++++++++++
> >> >  util/meson.build    |   1 +
> >> >  util/yank.c         | 213 ++++++++++++++++++++++++++++++++++++++++++++    
> >> 
> >> checkpatch.pl warns:
> >> 
> >>     WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
> >> 
> >> Can we find a maintainer for the two new files?  
> >
> > Yes, I'm maintaining this for now, see patch 7.  
> 
> Thanks!  Would it make sense to add the yank stuff to a new QAPI module
> yank.json instead of misc.jaon, so the new MAINTAINERS stanza can cover
> it?

Yes, makes sense. Changed for the next version.

> [...]
> >> > diff --git a/qapi/misc.json b/qapi/misc.json
> >> > index 40df513856..3b7de02a4d 100644
> >> > --- a/qapi/misc.json
> >> > +++ b/qapi/misc.json  
> [...]
> >> > +##
> >> > +# @YankInstance:
> >> > +#
> >> > +# A yank instance can be yanked with the "yank" qmp command to recover from a
> >> > +# hanging qemu.    
> >> 
> >> QEMU
> >>   
> >> > +#
> >> > +# Currently implemented yank instances:
> >> > +#  -nbd block device:
> >> > +#   Yanking it will shutdown the connection to the nbd server without
> >> > +#   attempting to reconnect.
> >> > +#  -socket chardev:
> >> > +#   Yanking it will shutdown the connected socket.
> >> > +#  -migration:
> >> > +#   Yanking it will shutdown all migration connections.    
> >> 
> >> To my surprise, this is recognized as bullet list markup.  But please
> >> put a space between the bullet and the text anyway.
> >> 
> >> Also: "shutdown" is a noun, the verb is spelled "shut down".  
> >
> > Both changed for the next version.
> >  
> >> In my review of v8, I asked how yanking migration is related to command
> >> migrate_cancel.  Daniel explained:
> >> 
> >>     migrate_cancel will do a shutdown() on the primary migration socket only.
> >>     In addition it will toggle the migration state.
> >> 
> >>     Yanking will do a shutdown on all migration sockets (important for
> >>     multifd), but won't touch migration state or any other aspect of QEMU
> >>     code.
> >> 
> >>     Overall yanking has less potential for things to go wrong than the
> >>     migrate_cancel method, as it doesn't try to do any kind of cleanup
> >>     or migration.
> >> 
> >> Would it make sense to work this into the documentation?  
> >
> > How about this?
> >
> >   - migration:
> >     Yanking it will shut down all migration connections. Unlike
> >     @migrate_cancel, it will not notify the migration process,
> >     so migration will go into @failed state, instead of @cancelled
> >     state.  
> 
> Works for me.  Advice on when to use it rather than migrate_cancel would
> be nice, though.

Ok, Changed for the next version.

> >> > +#
> >> > +# Since: 5.2
> >> > +##
> >> > +{ 'union': 'YankInstance',
> >> > +  'base': { 'type': 'YankInstanceType' },
> >> > +  'discriminator': 'type',
> >> > +  'data': {
> >> > +      'blockdev': 'YankInstanceBlockdev',
> >> > +      'chardev': 'YankInstanceChardev' } }
> >> > +
> >> > +##
> >> > +# @yank:
> >> > +#
> >> > +# Recover from hanging qemu by yanking the specified instances. See    
> >> 
> >> QEMU
> >> 
> >> "Try to recover" would be more precise, I think.  
> >
> > Changed for the next version.
> >  
> >> > +# "YankInstance" for more information.
> >> > +#
> >> > +# Takes a list of @YankInstance as argument.
> >> > +#
> >> > +# Returns: nothing.
> >> > +#
> >> > +# Example:
> >> > +#
> >> > +# -> { "execute": "yank",
> >> > +#      "arguments": {
> >> > +#          "instances": [
> >> > +#               { "type": "block-node",
> >> > +#                 "node-name": "nbd0" }
> >> > +#          ] } }
> >> > +# <- { "return": {} }
> >> > +#
> >> > +# Since: 5.2
> >> > +##
> >> > +{ 'command': 'yank',
> >> > +  'data': { 'instances': ['YankInstance'] },
> >> > +  'allow-oob': true }
> >> > +  
> [...]
> >> > diff --git a/util/yank.c b/util/yank.c
> >> > new file mode 100644
> >> > index 0000000000..0b3a816706
> >> > --- /dev/null
> >> > +++ b/util/yank.c  
> [...]
> >> > +void qmp_yank(YankInstanceList *instances,
> >> > +              Error **errp)
> >> > +{
> >> > +    YankInstanceList *tail;
> >> > +    YankInstanceEntry *entry;
> >> > +    YankFuncAndParam *func_entry;
> >> > +
> >> > +    qemu_mutex_lock(&yank_lock);
> >> > +    for (tail = instances; tail; tail = tail->next) {
> >> > +        entry = yank_find_entry(tail->value);
> >> > +        if (!entry) {
> >> > +            error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND, "Instance not found");    
> >> 
> >> Quote error.h:
> >> 
> >>  * Note: use of error classes other than ERROR_CLASS_GENERIC_ERROR is
> >>  * strongly discouraged.
> >> 
> >> Any particular reason for ERROR_CLASS_DEVICE_NOT_FOUND?  If not, then
> >> error_setg(), please.  
> >
> > There may be a time-to-check to time-to-use race condition here. Suppose
> > the management application (via QMP) calls 'blockev-del nbd0', then QEMU
> > hangs, so after a timeout it calls 'yank nbd0' while shortly before that
> > QEMU unhangs for some reason and removes the blockdev. Then yank returns
> > this error.
> >
> > QMP errors should not be parsed except for the error class, so the the
> > management application can only differentiate between this (ignorable)
> > error and other (possibly fatal) ones by parsing the error class.  
> 
> I see.
> 
> DeviceNotFound doesn't really fit, but none of the other error classes
> is any better.
> 
> I think the line
> 
>       # Returns: nothing.
> 
> in the QAPI schema (quoted above) should be expanded to something like
> 
> 
>       # Returns: - Nothing on success
>                  - If the YankInstance doesn't exist, DeviceNotFound

Changed for the next version.

> >> > +            qemu_mutex_unlock(&yank_lock);
> >> > +            return;
> >> > +        }
> >> > +    }
> >> > +    for (tail = instances; tail; tail = tail->next) {
> >> > +        entry = yank_find_entry(tail->value);
> >> > +        assert(entry);
> >> > +        QLIST_FOREACH(func_entry, &entry->yankfns, next) {
> >> > +            func_entry->func(func_entry->opaque);
> >> > +        }
> >> > +    }
> >> > +    qemu_mutex_unlock(&yank_lock);
> >> > +}  
> [...]
> 



-- 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2020-10-30 15:21 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-28 18:45 [PATCH v9 0/8] Introduce 'yank' oob qmp command to recover from hanging qemu Lukas Straub
2020-10-28 18:45 ` [PATCH v9 1/8] Introduce yank feature Lukas Straub
2020-10-29 16:36   ` Markus Armbruster
2020-10-30 10:32     ` Lukas Straub
2020-10-30 14:02       ` Markus Armbruster
2020-10-30 15:19         ` Lukas Straub [this message]
2020-10-28 18:45 ` [PATCH v9 2/8] block/nbd.c: Add " Lukas Straub
2020-10-28 18:45 ` [PATCH v9 3/8] chardev/char-socket.c: " Lukas Straub
2020-10-28 18:45 ` [PATCH v9 4/8] migration: " Lukas Straub
2020-10-28 18:45 ` [PATCH v9 5/8] io/channel-tls.c: make qio_channel_tls_shutdown thread-safe Lukas Straub
2020-10-28 18:45 ` [PATCH v9 6/8] io: Document qmp oob suitability of qio_channel_shutdown and io_shutdown Lukas Straub
2020-10-28 18:45 ` [PATCH v9 7/8] MAINTAINERS: Add myself as maintainer for yank feature Lukas Straub
2020-10-28 18:45 ` [PATCH v9 8/8] tests/test-char.c: Wait for the chardev to connect in char_socket_client_dupid_test Lukas Straub

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201030161908.2f71ecd5@luklap \
    --to=lukasstraub2@web.de \
    --cc=armbru@redhat.com \
    --cc=berrange@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=marcandre.lureau@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).