Re: [PATCH v6.2 05/11] libxl_qmp: Implementation of libxl__ev_qmp_*

From: Anthony PERARD <anthony.perard@citrix.com>
To: Ian Jackson <ian.jackson@citrix.com>
Cc: xen-devel@lists.xenproject.org, Wei Liu <wei.liu2@citrix.com>
Subject: Re: [PATCH v6.2 05/11] libxl_qmp: Implementation of libxl__ev_qmp_*
Date: Fri, 16 Nov 2018 17:10:46 +0000	[thread overview]
Message-ID: <20181116171046.GP1302@perard.uk.xensource.com> (raw)
In-Reply-To: <23534.60319.808138.50737@mariner.uk.xensource.com>

On Fri, Nov 16, 2018 at 04:09:03PM +0000, Ian Jackson wrote:
> Anthony PERARD writes ("Re: [PATCH v6.2 05/11] libxl_qmp: Implementation of libxl__ev_qmp_*"):
> > On Thu, Nov 15, 2018 at 06:40:13PM +0000, Ian Jackson wrote:
> > > It would be better if the id column stated something more useful than
> > > `set'.  `next' maybe, where applicable ?
> > 
> > When 'set', id is the id of the user command we intend to send or have
> > already sent and waiting for the associated reply. I'm not sure which
> > word or set of word would describe enough. I might need to add a comment
> > below the table.
> 
> You could use `sent', `next', alongside your existing `prev' ?

Will do.

> > > > + * - Allowed internal state transition:
> > > > + * disconnected                         -> connecting
> > > > + * connection                           -> capability_negotiation
> > > > + * capability_negotiation/waiting_reply -> connected
> > > > + * connected                            -> waiting_reply
> > > > + * any                                  -> disconnected
> > > 
> > > This does not mention the state `broken' and it should.
> > 
> > Maybe:
> >   any -> broken
> 
> Fine by me.  You want to mention broken -> disconnected too :-).

Will, that part of any->disconnected, but it doesn't hurt to add it.

> > > > +static int qmp_ev_prepare_cmd(libxl__gc *gc,
> > > > +                              libxl__ev_qmp *ev,
> > > > +                              const char *cmd,
> > > > +                              const libxl__json_object *args)
> ...
> > > > +    ev->id++;
> > > > +    buf = qmp_prepare_cmd(gc, cmd, args, ev->id, &len);
> > > > +    if (!buf) {
> > > > +        return ERROR_FAIL;
> > > > +    }
> > > 
> > > There is no doc comment for qmp_prepare_cmd saying whether it logs on
> > > error.  So it doesn't ?  In which case maybe you should do so here.
> > > 
> > > Alternatively if it can only really fail due to memory allocation
> > > failure, maybe it should return void (and call libxl__alloc_failed
> > > on allocation failure) instead ?
> > 
> > So we can have the allocation been done by libxl's functions, as
> > yajl_gen_alloc can use other functions that malloc/free/realloc, that
> > would remove the allocation failure.
> 
> Right.
> 
> > But then I think that qmp_prepare_cmd could fail and return null if
> > 'cmd' or 'args' contains bad json objects, or a wrong string.
> > 
> > So we still needs to detect failure.
> 
> OK.
> 
> > I log here as I can add the domid.
> 
> I'm not sure what you mean.  Where do you log ?  Not here.

Not yet :). There is maybe missing words in my sentence...

> > > > +/* Setup connection */
> > > > +
> > > > +static int qmp_ev_connect(libxl__gc *gc, libxl__ev_qmp *ev)
> > > > +    /* disconnected -> connecting but with `msg` free
> > > > +     * on error: disconnected
> > > > +     * If the initial state isn't disconnected, then nothing is done */
> > > 
> > > Maybe it be better for this function to return `broken' on error ?
> > 
> > That not the case, but I could change to return broken on error, and let
> > the caller clean the state.
> 
> That's what I mean.  That might simplify this function and the caller
> probably has the relevant cleanup path already.  (I haven't checked.)

It does, and yes, that will make the function simpler.

> > > > +    if (revents & POLLOUT) {
> > > > +        rc = qmp_ev_callback_writable(gc, ev, fd);
> > > ...
> > > > +    if (revents & POLLIN) {
> > > > +        rc = qmp_ev_callback_readable(egc, ev, fd);
> > > > +        if (rc)
> > > > +            goto out;
> > > > +
> > > > +        /* parse input */
> > > 
> > > I find it odd that this input parsing is not part of
> > > qmp_ev_callback_readable.  What do you think about moving it there ?
> > 
> > I wanted to reduce the number of function calls between when a user
> > callback is called and when ev_qmp's control passes outsite of ev_qmp's
> > implementation. So I moved qmp_ev_handle_message() call here.
> 
> Why did you want to reduce the number of function calls ?

Well, there is a need to track that `ev` might be discarded, and the
only way left is via a return value. It is probably easier to follow if
less function have the possibility that a user callback have been
called.

> > > > +    /*
> > > > +     * We will send a file descriptor associated with a command on the
> > > > +     * first byte of this command.
> > > > +     */
> > > > +    if (ev->qmp_state == qmp_state_waiting_reply &&
> > > > +        ev->fd >= 0 &&
> > > > +        ev->tx_buf_off == 0) {
> > > 
> > > According to the doc comments, state waiting_reply might have tx_buf
> > > free, in which case it would try to execute this code.  I don't think
> > > that can be right.
> > 
> > When tx_buf is free, that function returns earlier. So this code isn't
> > executed.
> 
> Oh yes, so it does.  But why are we entering this callback with a free
> tx_buf at all ?  If we just return, we are likely to get called again,
> ie an infinite loop.  This can only occur as a result of a bug ?
> In which case maybe it should be an assert ?

Yes, I can probably add the assert.

> > > I wrote:
> > > 
> > >   But I think you should treat only `\n' as the delimiter.  This will
> > >   considerably simplify the buffer handling.  (You should check and trim
> > >   the preceding `\r' before passing things to libxl__json_parse of
> > >   course.)
> > > 
> > > But you don't seem to have done that, or replied ?
> > 
> > This is still simpler than what we had before, and if I look for only
> > '\n', I would need a loop in cases where \n isn't precedded by \r. I
> > also check the whole buffer rather than new data, so looking for \r and
> > check if it is followed by a \n is similar.
> 
> I think a bare \n is not legal and should be treated as a protocol
> error.  Don't you agree ?  Given that, you can search for \n, and if
> it is not preceded by \r, call it an error.

I can't find anything that say bare \n are not legal. \n is part of
rfc7159, which the qmp-spec mentions when speaking about JSON data
structures. It is even possible to ask QEMU to add bare \n, this is done
with pretty=on.

> > > > +    case LIBXL__QMP_MESSAGE_TYPE_QMP:
> > > ...
> > > > +        assert(!ev->tx_buf);
> > > > +        ev->tx_buf = qmp_prepare_cmd(gc, "qmp_capabilities", NULL,
> > > > +                              QMP_CAPABILITY_NEGOTIATION_MSGID,
> > > > +                              &ev->tx_buf_len);
> > > > +        ev->tx_buf_off = 0;
> > > 
> > > Why do you not use qmp_ev_prepare_cmd for this ?
> > 
> > Because qmp_ev_prepare_cmd does more like increasing the id. This
> > qmp_capabilities is a special command that is part of the
> > initialisation. I need to know which id is used so that I can track it
> > later and found out when QEMU is ready to receive other commands.
> 
> But see below about the id.  I think you should be using the normal
> id.
> 
> In another thread you were talking about possibly trying to reuse an
> existing session and putting the pid in the id or something.  In this
> case a fixed id for the capabilities command is not good either.
> 
> Let me bring forward the part where we handle the resuls:
> 
> > > > +        id = libxl__json_object_get_integer(o);
> > > > +
> > > > +        if (id == QMP_CAPABILITY_NEGOTIATION_MSGID) {
> > > > +            /* We have a response to our qmp_capabilities cmd */
> > > > +            if (ev->qmp_state != qmp_state_capability_negotiation ||
> > > > +                type != LIBXL__QMP_MESSAGE_TYPE_RETURN)
> > > > +                goto out_unknown_id;
> > > > +            qmp_ev_set_state(gc, ev, qmp_state_connected);
> > > > +            return 0;
> > > > +        }
> > > 
> > > I'm a bit puzzled about the capability negotation, its id, etc.:
> > > 
> > > * Why does QMP_CAPABILITY_NEGOTIATION_MSGID exist at all ?  You
> > >   could just use a normal id, couldn't you ?  And you'd be able to
> > >   tell from your own state that it was the right value.
> > 
> > But I would then needs to track two different ids, one for the
> > qmp_capabilities command, and one for the user's command.
> 
> But you only construct the id when you prepare the command into
> tx_buf.  So you only have one id at a time: the id of the most
> recently prepared command.  Or am I wrong ?
> 
> ...
> 
> I have looked at this code again and I was wrong.  You format the
> user's message, with id, right at the beginning.  (The sequencing of
> events in the code structure is slightly odd, in that the capabilities
> message is formatted after the user's message, but sent beforehand.)
> 
> I still think for the reasons above that you probably can't have a
> fixed QMP_CAPABILITY_NEGOTIATION_MSGID because we may want to be able
> to reuse an old transport connection in the future.
> 
> So that means you do need two ids.  (I think you can't defer
> formatting the user's message, baking an id into it, because the
> lifetime of the callers' json object is too short for you to save it.)
> 
> I'm not sure I can see sensible a way of doing this that doesn't have
> *three* id variables:
>   - a counter for generating new ids
>   - the id put in the capabilities command
>   - the id put in the user's command

That sounds fine.

> > We could call qmp_ev_prepare_cmd, qmp_ev_prepare_user_cmd.
> > 
> > > > +    case LIBXL__QMP_MESSAGE_TYPE_RETURN:
> > > > +    case LIBXL__QMP_MESSAGE_TYPE_ERROR:
> > > ...
> > > > +        o = libxl__json_map_get("id", resp, JSON_INTEGER);
> > > > +        if (!o) {
> > > > +            /*
> > > > +             * If "id" isn't present, an error occur on the server before
> > > > +             * it has read the "id" provided by libxl.
> > > > +             */
> > > > +            qmp_ev_parse_error_messages(egc, ev, resp);
> > > > +            return ERROR_PROTOCOL_ERROR_QMP;
> > > 
> > > I think you wanted to return the error code from
> > > qmp_ev_parse_error_messages ?
> > 
> > Yeah, but then I've added to the public documentation that ERROR_QMP_*
> > means that ev_qmp is still Connected.
> 
> Oh.  Right.  I see.  Now that you have explained I think the code is
> correct.  Can you add a comment about this ?  Something like this:
>    /*
>     * Deliberately squash all errors into ERROR_PROTOCOL_ERROR_QMP.
>     * qmp_ev_parse_error_messages will only return that, or
>     * ERROR_QMP_*; but ERROR_QMP_* is reserved for errors resulting
>     * from the caller's command, and when we return that we promise
>     * that the ev_qmp is Connected.
>     */

Will do.

> > > * Why are we even doing capability negotation when we throw the answer
> > >   away ?
> > 
> > It is part of the protocol.
> > 
> > And qmp_capabilities doesn't return anything (just an empty object).
> 
> Oh, is the purpose to inform qemu what our capabilities are ?

It actually inform qemu of the capabilities we want. The QMP server will
not accept any other commands before the client execute qmp_capabilities.

The conversation between QEMU and libxl goes like this:

QEMU: Hi, I'm QEMU 3.0, I can do "oob".
libxl: Hi, I don't need any capabilities.
QEMU: Thanks, you can now run any other command you like.
libxl sends the user command.

Or in term of edited JSON:
QEMU:  { "QMP": {"version": XXX, "capabilities": ["oob"] } }
libxl: { "execute": "qmp_capabilities", "arguments": {} }
QEMU:  { "return": {}}

And here is a link to the example in the QMP spec document
(3. QMP Examples):
https://git.qemu.org/?p=qemu.git;a=blob;f=docs/interop/qmp-spec.txt;h=8f7da0245d51447be7df2b3d4b105bad9fbec0b3;hb=HEAD#l244

Maybe I should add somewhere in a comment where to find the QMP spec,
even so we already have "This file implement a client for QMP (QEMU
Monitor Protocol). For the Specification, see in the QEMU repository."
at the top of libxl_qmp.c

Thanks,

-- 
Anthony PERARD

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel