From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:35693) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cRjIt-0001LF-7O for qemu-devel@nongnu.org; Thu, 12 Jan 2017 12:36:51 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cRjIo-0000lS-Gi for qemu-devel@nongnu.org; Thu, 12 Jan 2017 12:36:47 -0500 Received: from mx5-phx2.redhat.com ([209.132.183.37]:47874) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1cRjIo-0000kF-2Z for qemu-devel@nongnu.org; Thu, 12 Jan 2017 12:36:42 -0500 Date: Thu, 12 Jan 2017 12:36:40 -0500 (EST) From: =?utf-8?Q?Marc-Andr=C3=A9?= Lureau Message-ID: <1430656579.1967521.1484242600045.JavaMail.zimbra@redhat.com> In-Reply-To: <871sw8ct9r.fsf@dusky.pond.sub.org> References: <20170109143437.30554-1-marcandre.lureau@redhat.com> <20170109143437.30554-16-marcandre.lureau@redhat.com> <87shopn04s.fsf@dusky.pond.sub.org> <2044540165.1410809.1484151684572.JavaMail.zimbra@redhat.com> <871sw8ct9r.fsf@dusky.pond.sub.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [PATCH v7 15/21] qapi: add qapi2texi script List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Markus Armbruster Cc: =?utf-8?Q?Marc-Andr=C3=A9?= Lureau , qemu-devel@nongnu.org Hi ----- Original Message ----- > Marc-Andr=C3=A9 Lureau writes: >=20 > > Hi > > > > ----- Original Message ----- > >> Marc-Andr=C3=A9 Lureau writes: > >>=20 > >> > As the name suggests, the qapi2texi script converts JSON QAPI > >> > description into a texi file suitable for different target > >> > formats (info/man/txt/pdf/html...). > >> > > >> > It parses the following kind of blocks: > >> > > >> > Free-form: > >> > > >> > ## > >> > # =3D Section > >> > # =3D=3D Subsection > >> > # > >> > # Some text foo with *emphasis* > >> > # 1. with a list > >> > # 2. like that > >> > # > >> > # And some code: > >> > # | $ echo foo > >> > # | -> do this > >> > # | <- get that > >> > # > >> > ## > >> > > >> > Symbol description: > >> > > >> > ## > >> > # @symbol: > >> > # > >> > # Symbol body ditto ergo sum. Foo bar > >> > # baz ding. > >> > # > >> > # @param1: the frob to frobnicate > >> > # @param2: #optional how hard to frobnicate > >> > # > >> > # Returns: the frobnicated frob. > >> > # If frob isn't frobnicatable, GenericError. > >> > # > >> > # Since: version > >> > # Notes: notes, comments can have > >> > # - itemized list > >> > # - like this > >> > # > >> > # Example: > >> > # > >> > # -> { "execute": "quit" } > >> > # <- { "return": {} } > >> > # > >> > ## > >> > > >> > That's roughly following the following EBNF grammar: > >> > > >> > api_comment =3D "##\n" comment "##\n" > >> > comment =3D freeform_comment | symbol_comment > >> > freeform_comment =3D { "# " text "\n" | "#\n" } > >> > symbol_comment =3D "# @" name ":\n" { member | tag_section | > >> > freeform_comment > >> > } > >> > member =3D "# @" name ':' [ text ] "\n" freeform_comment > >> > tag_section =3D "# " ( "Returns:", "Since:", "Note:", "Notes:", > >> > "Example:", > >> > "Examples:" ) [ text ] "\n" freeform_comment > >> > text =3D free text with markup > >> > > >> > Note that the grammar is ambiguous: a line "# @foo:\n" can be parsed > >> > both as freeform_comment and as symbol_comment. The actual parser > >> > recognizes symbol_comment. > >> > > >> > See docs/qapi-code-gen.txt for more details. > >> > > >> > Deficiencies: > >>=20 > >> Perhaps "Deficiencies and limitations". > > > > ok > > > >>=20 > >> > - the generated QMP documentation includes internal types > >> > - union-type support is lacking > >>=20 > >> "union type" (no dash). > >>=20 > >> > - type information is lacking in generated documentation > >> > - doc comment error message positions are imprecise, they point > >> > to the beginning of the comment. > >> > - see other TODO/FIXME in this commit > >>=20 > >> Suggest: - a few minor issues, all marked TODO/FIXME in the code > >>=20 > > > > ok > > > >> > > >> > Signed-off-by: Marc-Andr=C3=A9 Lureau > >> > --- > >> [diffstat snipped...] > >> > diff --git a/scripts/qapi.py b/scripts/qapi.py > >> > index 3d5f9e1eaf..a92a86f428 100644 > >> > --- a/scripts/qapi.py > >> > +++ b/scripts/qapi.py > >> > @@ -125,6 +125,122 @@ class QAPISemError(QAPIError): > >> > info['parent'], msg) > >> > =20 > >> > =20 > >> > +class QAPIDoc(object): > >> > + class Section(object): > >> > + def __init__(self, name=3DNone): > >> > + # optional section name (argument/member or section nam= e) > >> > + self.name =3D name > >> > + # the list of lines for this section > >> > + self.content =3D [] > >> > + > >> > + def append(self, line): > >> > + self.content.append(line) > >> > + > >> > + def __repr__(self): > >> > + return "\n".join(self.content).strip() > >> > + > >> > + class ArgSection(Section): > >> > + pass > >> > + > >> > + def __init__(self, parser, info): > >> > + # self.parser is used to report errors with QAPIParseError. > >> > The > >> > + # resulting error position depends on the state of the pars= er. > >> > + # It happens to be the beginning of the comment. More or l= ess > >> > + # servicable, but action at a distance. > >> > + self.parser =3D parser > >> > + self.info =3D info > >> > + self.symbol =3D None > >> > + self.body =3D QAPIDoc.Section() > >> > + # dict mapping parameter name to ArgSection > >> > + self.args =3D OrderedDict() > >> > + # a list of Section > >> > + self.sections =3D [] > >> > + # the current section > >> > + self.section =3D self.body > >> > + # associated expression (to be set by expression parser) > >> > + self.expr =3D None > >> > + > >> > + def has_section(self, name): > >> > + """Return True if we have a section with this name.""" > >> > + for i in self.sections: > >> > + if i.name =3D=3D name: > >> > + return True > >> > + return False > >> > + > >> > + def append(self, line): > >> > + """Parse a comment line and add it to the documentation.""" > >> > + line =3D line[1:] > >> > + if not line: > >> > + self._append_freeform(line) > >> > + return > >> > + > >> > + if line[0] !=3D ' ': > >> > + raise QAPIParseError(self.parser, "Missing space after = #") > >> > + line =3D line[1:] > >> > + > >> > + # FIXME not nice: things like '# @foo:' and '# @foo: ' are= n't > >> > + # recognized, and get silently treated as ordinary text > >> > + if self.symbol: > >> > + self._append_symbol_line(line) > >> > + elif not self.body.content and line.startswith("@"): > >> > + if not line.endswith(":"): > >> > + raise QAPIParseError(self.parser, "Line should end = with > >> > :") > >> > + self.symbol =3D line[1:-1] > >> > + # FIXME invalid names other than the empty string aren'= t > >> > flagged > >> > + if not self.symbol: > >> > + raise QAPIParseError(self.parser, "Invalid name") > >> > + else: > >> > + self._append_freeform(line) > >> > + > >> > + def _append_symbol_line(self, line): > >> > + name =3D line.split(' ', 1)[0] > >> > + > >> > + if name.startswith("@") and name.endswith(":"): > >> > + line =3D line[len(name)+1:] > >> > + self._start_args_section(name[1:-1]) > >> > + elif name in ("Returns:", "Since:", > >> > + # those are often singular or plural > >> > + "Note:", "Notes:", > >> > + "Example:", "Examples:", > >> > + "TODO:"): > >> > + line =3D line[len(name)+1:] > >> > + self._start_section(name[:-1]) > >> > + > >> > + self._append_freeform(line) > >> > + > >> > + def _start_args_section(self, name): > >> > + # FIXME invalid names other than the empty string aren't > >> > flagged > >> > + if not name: > >> > + raise QAPIParseError(self.parser, "Invalid parameter na= me") > >> > + if name in self.args: > >> > + raise QAPIParseError(self.parser, > >> > + "'%s' parameter name duplicated" % > >> > name) > >> > + if self.sections: > >> > + raise QAPIParseError(self.parser, > >> > + "'%s' parameter documentation is > >> > interleaved " > >> > + "with other sections" % name) > >>=20 > >> This error message is rather confusing. Ideally, we'd point to the th= e > >> source code that made us add to self.sections, but that's not in the > >> cards right now. Here's my best try: > >>=20 > >> raise QAPIParseError(self.parser, > >> "'@%s:' can't follow '%s' section" > >> % (name, self.sections[0].name)) > >>=20 > > > > works for me too > > > >> > + self.section =3D QAPIDoc.ArgSection(name) > >> > + self.args[name] =3D self.section > >> > + > >> > + def _start_section(self, name=3D""): > >>=20 > >> Note: @name changed to default to "" since v6 for the benefit of > >> _append_freeform() below. > >>=20 > >> > + if name in ("Returns", "Since") and self.has_section(name): > >> > + raise QAPIParseError(self.parser, > >> > + "Duplicated '%s' section" % name) > >> > + self.section =3D QAPIDoc.Section(name) > >> > + self.sections.append(self.section) > >> > + > >> > + def _append_freeform(self, line): > >> > + in_arg =3D isinstance(self.section, QAPIDoc.ArgSection) > >> > + if (in_arg and self.section.content and > >> > + not self.section.content[-1] and > >> > + line and not line[0].isspace()): > >>=20 > >> PEP8 recommends breaking lines before operators in new code. > > > > ok > > > >>=20 > >> > + self._start_section() > >>=20 > >> Changed from v6's > >>=20 > >> raise QAPIParseError(self.parser, "Invalid section > >> indentation") > >>=20 > >> May well be an improvement (I'm certainly no fan of this error myself)= , > >> but it throws my review off the fast track: now I have to figure out h= ow > >> the new code works. Can you give me a hint on what this change does? > > > > In v6, we rejected unindented sections after arguments (see > > tests/qapi-schema/doc-bad-section.json): > > > > # @arg: blah bla > > # zing bar > > # > > # This should be rejected > > > > And a patch in v6 reordered the json comments. But you asked for the > > doc to not be reordered without careful review (both source and > > output). So those additional paragraphs are added as untitled > > sections, and the doc generator keep section order. The output will be > > in the same order as the source. >=20 > Thanks. >=20 > >> Or should we take a shortcut and revert to v6 here? We can always > >> improve on top. > > > > Probably not needed to revert. >=20 > I played with it a bit, and it looks like it's working fine. >=20 > >> > + if (in_arg or not self.section.name or > >> > + not self.section.name.startswith("Example")): > >>=20 > >> Again, break before the operator. > > > > ok > > > >>=20 > >> > + line =3D line.strip() > >> > + self.section.append(line) > >> > + > >> > + > >> > class QAPISchemaParser(object): > >> > =20 > >> > def __init__(self, fp, previously_included=3D[], incl_info=3DNo= ne): > >> > @@ -140,11 +256,17 @@ class QAPISchemaParser(object): > >> > self.line =3D 1 > >> > self.line_pos =3D 0 > >> > self.exprs =3D [] > >> > + self.docs =3D [] > >> > self.accept() > >> > =20 > >> > while self.tok is not None: > >> > info =3D {'file': fname, 'line': self.line, > >> > 'parent': self.incl_info} > >> > + if self.tok =3D=3D '#': > >> > + doc =3D self.get_doc(info) > >> > + self.docs.append(doc) > >> > + continue > >> > + > >> > expr =3D self.get_expr(False) > >> > if isinstance(expr, dict) and "include" in expr: > >> > if len(expr) !=3D 1: > >> > @@ -162,6 +284,7 @@ class QAPISchemaParser(object): > >> > raise QAPISemError(info, "Inclusion loop fo= r > >> > %s" > >> > % include) > >> > inf =3D inf['parent'] > >> > + > >> > # skip multiple include of the same file > >> > if incl_abs_fname in previously_included: > >> > continue > >> > @@ -172,12 +295,19 @@ class QAPISchemaParser(object): > >> > exprs_include =3D QAPISchemaParser(fobj, > >> > previously_included, > >> > info) > >> > self.exprs.extend(exprs_include.exprs) > >> > + self.docs.extend(exprs_include.docs) > >> > else: > >> > expr_elem =3D {'expr': expr, > >> > 'info': info} > >> > + if (self.docs and > >> > + self.docs[-1].info['file'] =3D=3D fname and > >> > + not self.docs[-1].expr): > >>=20 > >> Again, break before the operator. > > > > ok > > > >>=20 > >> > + self.docs[-1].expr =3D expr > >> > + expr_elem['doc'] =3D self.docs[-1] > >> > + > >> > self.exprs.append(expr_elem) > >> > =20 > >> > - def accept(self): > >> > + def accept(self, skip_comment=3DTrue): > >> > while True: > >> > self.tok =3D self.src[self.cursor] > >> > self.pos =3D self.cursor > >> > @@ -185,7 +315,13 @@ class QAPISchemaParser(object): > >> > self.val =3D None > >> > =20 > >> > if self.tok =3D=3D '#': > >> > + if self.src[self.cursor] =3D=3D '#': > >> > + # Start of doc comment > >> > + skip_comment =3D False > >> > self.cursor =3D self.src.find('\n', self.cursor) > >> > + if not skip_comment: > >> > + self.val =3D self.src[self.pos:self.cursor] > >> > + return > >> > elif self.tok in "{}:,[]": > >> > return > >> > elif self.tok =3D=3D "'": > >>=20 > >> Copied from review of v3, so I don't forget: > >>=20 > >> Comment tokens are thrown away as before, except when the parser asks > >> for them by passing skip_comment=3DFalse, or when the comment token st= arts > >> with ##. The parser asks while parsing a doc comment, in get_doc(). > >>=20 > >> This is a backchannel from the parser to the lexer. I'd rather avoid > >> such lexer hacks, but I guess we can address that on top. > >>=20 > >> A comment starting with ## inside an expression is now a syntax error. > >> For instance, input > >>=20 > >> { > >> ## > >>=20 > >> yields > >>=20 > >> /dev/stdin:2:1: Expected string or "}" > >>=20 > >> Rather unfriendly error message, but we can fix that on top. > >>=20 > >> > @@ -319,6 +455,28 @@ class QAPISchemaParser(object): > >> > raise QAPIParseError(self, 'Expected "{", "[" or string= ') > >> > return expr > >> > =20 > >> > + def get_doc(self, info): > >> > + if self.val !=3D '##': > >> > + raise QAPIParseError(self, "Junk after '##' at start of= " > >> > + "documentation comment") > >> > + > >> > + doc =3D QAPIDoc(self, info) > >> > + self.accept(False) > >> > + while self.tok =3D=3D '#': > >> > + if self.val.startswith('##'): > >> > + # End of doc comment > >> > + if self.val !=3D '##': > >> > + raise QAPIParseError(self, "Junk after '##' at = end > >> > of > >> > " > >> > + "documentation comment") > >> > + self.accept() > >> > + return doc > >> > + else: > >> > + doc.append(self.val) > >> > + self.accept(False) > >> > + > >> > + raise QAPIParseError(self, "Documentation comment must end = with > >> > '##'") > >> > + > >> > + > >> > # > >> > # Semantic analysis of schema expressions > >> > # TODO fold into QAPISchema > >> > @@ -703,6 +861,11 @@ def check_exprs(exprs): > >> > for expr_elem in exprs: > >> > expr =3D expr_elem['expr'] > >> > info =3D expr_elem['info'] > >> > + > >> > + if 'doc' not in expr_elem: > >> > + raise QAPISemError(info, > >> > + "Expression missing documentation > >> > comment") > >> > + > >> > if 'enum' in expr: > >> > check_keys(expr_elem, 'enum', ['data'], ['prefix']) > >> > add_enum(expr['enum'], info, expr['data']) > >> > @@ -761,6 +924,84 @@ def check_exprs(exprs): > >> > return exprs > >> > =20 > >> > =20 > >> > +def check_freeform_doc(doc): > >> > + if doc.symbol: > >> > + raise QAPISemError(doc.info, > >> > + "Documention for '%s' is not followed" > >> > + " by the definition" % doc.symbol) > >> > + > >> > + body =3D str(doc.body) > >> > + if re.search(r'@\S+:', body, re.MULTILINE): > >> > + raise QAPISemError(doc.info, > >> > + "Free-form documentation block must not > >> > contain" > >> > + " @NAME: sections") > >> > + > >> > + > >> > +def check_definition_doc(doc, expr, info): > >>=20 > >> Lots of churn in this function. It can certainly use improvement, as > >> pointed out in prior reviews, but it triggers still more review. Base= d > >> on the tests, I guess it's for finding documented parameters that don'= t > >> actually exist, and to diagnose optional mismatch. Anything else? > >>=20 > > > > implicitely adds #optional (since it's no longer in type info, it bette= r be > > in the description), code simplification, message improvement too. > > > > I'd rather keep this too, and improve on top. >=20 > Okay. Review below. >=20 > >> Or should we take a shortcut and revert to v6 here? We can always > >> improve on top. I think v6 would need a # TODO Should ensure #optiona= l > >> matches the schema. > >>=20 > >> > + for i in ('enum', 'union', 'alternate', 'struct', 'command', > >> > 'event'): > >> > + if i in expr: > >> > + meta =3D i > >> > + break > >> > + > >> > + name =3D expr[meta] > >> > + if doc.symbol !=3D name: > >> > + raise QAPISemError(info, "Definition of '%s' follows > >> > documentation" > >> > + " for '%s'" % (name, doc.symbol)) > >> > + if doc.has_section('Returns') and 'command' not in expr: > >> > + raise QAPISemError(info, "'Returns:' is only valid for > >> > commands") > >>=20 > >> Copied from review of v5, so I don't forget: > >>=20 > >> We accept 'Returns:' even when the command doesn't return anything, > >> because we currently use it to document errors, too. Can't say I like > >> that, but it's out of scope here. > >>=20 > >> > + > >> > + if meta =3D=3D 'union': > >> > + args =3D expr.get('base', []) >=20 > Note: args is either string or dictionary. >=20 > >> > + else: > >> > + args =3D expr.get('data', []) >=20 > Note: args is either string (only if command or event), dictionary (only > if struct, alternate, command, event) or list (only if enum) >=20 > >> > + if isinstance(args, dict): > >> > + args =3D args.keys() >=20 > Now args is either string (only if command or event) or list. >=20 > >> > + > >> > + if meta =3D=3D 'alternate' or \ > >> > + (meta =3D=3D 'union' and not expr.get('discriminator')): >=20 > PEP8 prefers parenthesis over backslash, and recommends breaking lines > before instead of after binary operators: ok >=20 > if (meta =3D=3D 'alternate' > or (meta =3D=3D 'union' and not expr.get('discriminator')))= : >=20 > Note: args can only be list here. >=20 > >> > + args.append('type') >=20 > Works because we know it's a list. It's less than obvious, though. >=20 > >> > + > >> > + if isinstance(args, list): > >> > + for arg in args: > >> > + if arg[0] =3D=3D '*': > >> > + opt =3D True > >> > + desc =3D doc.args.get(arg[1:]) > >> > + else: > >> > + opt =3D False > >> > + desc =3D doc.args.get(arg) > >> > + if not desc: > >> > + continue > >> > + desc_opt =3D "#optional" in str(desc) > >> > + if desc_opt and not opt: > >> > + raise QAPISemError(info, "Description has #optional= , " > >> > + "but the declaration doesn't") > >> > + if not desc_opt and opt: > >> > + # silently fix the doc > >> > + desc.append("#optional") >=20 > #optional is redundant information. But as long as we use it, it better > be correct, to avoid misleading readers of the schema. Two sides of > correctness: it's there when arg is optional, and it's not there when > it's not. You enforce the latter, but you silently fix the former. I > figure you do that because #optional is currently missing in quite a few > places. >=20 > Let's add > # TODO either fix the schema and make this an error= , > # or drop #optional entirely >=20 ok > >> > + > >> > + doc_args =3D set(doc.args.keys()) > >> > + args =3D set([name.strip('*') for name in args]) >=20 > Now args morphs from string or dict to set. I'd use a new variable. >=20 > If args is a string, it becomes a set of characters, which is not what > you want. For instance, qapi-schema.json's QCryptoBlockOpenOptions has > 'base': 'QCryptoBlockOptionsBase', and args changes from > 'QCryptoBlockOptionsBase' to set(['O', 'a', 'C', 'B', 'e', 'i', 'c', > 'k', 'l', 'o', 'n', 'Q', 'p', 's', 'r', 't', 'y']) here. >=20 > >> > + if not doc_args.issubset(args): > >> > + raise QAPISemError(info, "The following documented members = are > >> > not in " > >> > + "the declaration: %s" % ", ".join(doc_ar= gs - > >> > args)) >=20 > If args is originally a string, then we happily accept documentation for > its characters. Quite unlikely to bite in practice, but it's wrong > anyways. >=20 > I suspect you could save yourself some grief by returning early when > args is string, say: >=20 > if meta =3D=3D 'union': > args =3D expr.get('base', []) > else: > args =3D expr.get('data', []) > if isinstance(args, str): > return > if isinstance(args, dict): > args =3D args.keys() > assert isinstance(args, list) >=20 > if (meta =3D=3D 'alternate' > or (meta =3D=3D 'union' and not expr.get('discriminator'))): > args.append('type') >=20 > for arg in args: > ... ok >=20 > >> Copied from review of v5, so I don't forget: > >>=20 > >> As explained in review of v3, this is only a subset of the real set of > >> members. Computing the exact set is impractical when working with the > >> abstract syntax tree. I believe we'll eventually have to rewrite this > >> code to work with the QAPISchemaEntity instead. > >>=20 > >> > + > >> > + > >> > +def check_docs(docs): > >> > + for doc in docs: > >> > + for section in doc.args.values() + doc.sections: > >> > + content =3D str(section) > >> > + if not content or content.isspace(): > >> > + raise QAPISemError(doc.info, > >> > + "Empty doc section '%s'" % > >> > section.name) > >> > + > >> > + if not doc.expr: > >> > + check_freeform_doc(doc) > >> > + else: > >> > + check_definition_doc(doc, doc.expr, doc.info) > >> > + > >> > + return docs > >> > + > >> > + > >> > # > >> > # Schema compiler frontend > >> > # > >> > @@ -1229,7 +1470,9 @@ class QAPISchemaEvent(QAPISchemaEntity): > >> > class QAPISchema(object): > >> > def __init__(self, fname): > >> > try: > >> > - self.exprs =3D check_exprs(QAPISchemaParser(open(fname, > >> > "r")).exprs) > >> > + parser =3D QAPISchemaParser(open(fname, "r")) > >> > + self.exprs =3D check_exprs(parser.exprs) > >> > + self.docs =3D check_docs(parser.docs) > >> > self._entity_dict =3D {} > >> > self._predefining =3D True > >> > self._def_predefineds() > >> > diff --git a/scripts/qapi2texi.py b/scripts/qapi2texi.py > >> > new file mode 100755 > >> > index 0000000000..436749ec7b > >> > --- /dev/null > >> > +++ b/scripts/qapi2texi.py > >> > @@ -0,0 +1,266 @@ > >> > +#!/usr/bin/env python > >> > +# QAPI texi generator > >> > +# > >> > +# This work is licensed under the terms of the GNU LGPL, version 2+= . > >> > +# See the COPYING file in the top-level directory. > >> > +"""This script produces the documentation of a qapi schema in texin= fo > >> > format""" > >> > +import re > >> > +import sys > >> > + > >> > +import qapi > >> > + > >> > +COMMAND_FMT =3D """ > >> > +@deftypefn {type} {{}} {name} > >> > + > >> > +{body} > >> > + > >> > +@end deftypefn > >> > + > >> > +""".format > >> > + > >> > +ENUM_FMT =3D """ > >> > +@deftp Enum {name} > >> > + > >> > +{body} > >> > + > >> > +@end deftp > >> > + > >> > +""".format > >> > + > >> > +STRUCT_FMT =3D """ > >> > +@deftp {{{type}}} {name} > >> > + > >> > +{body} > >> > + > >> > +@end deftp > >> > + > >> > +""".format > >> > + > >> > +EXAMPLE_FMT =3D """@example > >> > +{code} > >> > +@end example > >> > +""".format > >> > + > >> > + > >> > +def subst_strong(doc): > >> > + """Replaces *foo* by @strong{foo}""" > >> > + return re.sub(r'\*([^*\n]+)\*', r'@emph{\1}', doc) > >> > + > >> > + > >> > +def subst_emph(doc): > >> > + """Replaces _foo_ by @emph{foo}""" > >> > + return re.sub(r'\b_([^_\n]+)_\b', r' @emph{\1} ', doc) > >> > + > >> > + > >> > +def subst_vars(doc): > >> > + """Replaces @var by @code{var}""" > >> > + return re.sub(r'@([\w-]+)', r'@code{\1}', doc) > >> > + > >> > + > >> > +def subst_braces(doc): > >> > + """Replaces {} with @{ @}""" > >> > + return doc.replace("{", "@{").replace("}", "@}") > >> > + > >> > + > >> > +def texi_example(doc): > >> > + """Format @example""" > >> > + # TODO: Neglects to escape @ characters. > >> > + # We should probably escape them in subst_braces(), and rename = the > >> > + # function to subst_special() or subs_texi_special(). If we do > >> > that, > >> > we > >> > + # need to delay it until after subst_vars() in texi_format(). > >> > + doc =3D subst_braces(doc).strip('\n') > >> > + return EXAMPLE_FMT(code=3Ddoc) > >> > + > >> > + > >> > +def texi_format(doc): > >> > + """ > >> > + Format documentation > >> > + > >> > + Lines starting with: > >> > + - |: generates an @example > >> > + - =3D: generates @section > >> > + - =3D=3D: generates @subsection > >> > + - 1. or 1): generates an @enumerate @item > >> > + - */-: generates an @itemize list > >> > + """ > >> > + lines =3D [] > >> > + doc =3D subst_braces(doc) > >> > + doc =3D subst_vars(doc) > >> > + doc =3D subst_emph(doc) > >> > + doc =3D subst_strong(doc) > >> > + inlist =3D "" > >> > + lastempty =3D False > >> > + for line in doc.split('\n'): > >> > + empty =3D line =3D=3D "" > >> > + > >> > + # FIXME: Doing this in a single if / elif chain is > >> > + # problematic. For instance, a line without markup termina= tes > >> > + # a list if it follows a blank line (reaches the final elif= ), > >> > + # but a line with some *other* markup, such as a =3D title > >> > + # doesn't. > >> > + if line.startswith("| "): > >> > + line =3D EXAMPLE_FMT(code=3Dline[2:]) > >> > + elif line.startswith("=3D "): > >> > + line =3D "@section " + line[2:] > >> > + elif line.startswith("=3D=3D "): > >> > + line =3D "@subsection " + line[3:] > >> > + elif re.match("^([0-9]*[.]) ", line): > >>=20 > >> [.] is an unusual way to say \. > > > > Right, forgot to change that when removing ) from v6. > > > >>=20 > >> Perhaps we should consistently use r"..." for strings used as regular > >> expressions. > >>=20 > > > > No idea, could be changed on top. >=20 > Yes. But if you respin anyway, adding some r wouldn't hurt. ok >=20 > >> > + if not inlist: > >> > + lines.append("@enumerate") > >> > + inlist =3D "enumerate" > >> > + line =3D line[line.find(" ")+1:] > >> > + lines.append("@item") > >> > + elif re.match("^[*-] ", line): > >> > + if not inlist: > >> > + lines.append("@itemize %s" % {'*': "@bullet", > >> > + '-': "@minus"}[line[0= ]]) > >> > + inlist =3D "itemize" > >> > + lines.append("@item") > >> > + line =3D line[2:] > >> > + elif lastempty and inlist: > >> > + lines.append("@end %s\n" % inlist) > >> > + inlist =3D "" > >> > + > >> > + lastempty =3D empty > >> > + lines.append(line) > >> > + > >> > + if inlist: > >> > + lines.append("@end %s\n" % inlist) > >> > + return "\n".join(lines) > >> > + > >> > + > >> > +def texi_body(doc): > >> > + """ > >> > + Format the body of a symbol documentation: > >> > + - main body > >> > + - table of arguments > >> > + - followed by "Returns/Notes/Since/Example" sections > >> > + """ > >> > + body =3D texi_format(str(doc.body)) + "\n" > >> > + if doc.args: > >> > + body +=3D "@table @asis\n" > >> > + for arg, section in doc.args.iteritems(): > >> > + desc =3D str(section) > >> > + opt =3D '' > >> > + if "#optional" in desc: > >> > + desc =3D desc.replace("#optional", "") > >> > + opt =3D ' (optional)' > >> > + body +=3D "@item @code{'%s'}%s\n%s\n" % (arg, opt, > >> > + texi_format(desc= )) > >> > + body +=3D "@end table\n" > >> > + > >> > + for section in doc.sections: > >> > + name, doc =3D (section.name, str(section)) > >> > + func =3D texi_format > >> > + if name.startswith("Example"): > >> > + func =3D texi_example > >> > + > >> > + if name: > >>=20 > >> @quotation produces confusing .txt and .html output, as discussed in > >> review of v6. Either drop it, or record the problem in a comment, e.g= . > >>=20 > >> # FIXME the indentation produced by @quotation in .txt = and > >> # .html output is confusing > >>=20 > > > > added >=20 > Thanks. >=20 > >> > + body +=3D "\n@quotation %s\n%s\n@end quotation" % \ > >> > + (name, func(doc)) > >> > + else: > >> > + body +=3D func(doc) > >> > + > >> > + return body > >> > + > >> > + > >> > +def texi_alternate(expr, doc): > >> > + """Format an alternate to texi""" > >> > + body =3D texi_body(doc) > >> > + return STRUCT_FMT(type=3D"Alternate", > >> > + name=3Ddoc.symbol, > >> > + body=3Dbody) > >> > + > >> > + > >> > +def texi_union(expr, doc): > >> > + """Format a union to texi""" > >> > + discriminator =3D expr.get("discriminator") > >> > + if discriminator: > >> > + union =3D "Flat Union" > >> > + else: > >> > + union =3D "Simple Union" > >>=20 > >> Not sure flat vs. simple matters for users of generated documentation, > >> but let's not worry about that now. > >>=20 > >> > + > >> > + body =3D texi_body(doc) > >> > + return STRUCT_FMT(type=3Dunion, > >> > + name=3Ddoc.symbol, > >> > + body=3Dbody) > >> > + > >> > + > >> > +def texi_enum(expr, doc): > >> > + """Format an enum to texi""" > >> > + for i in expr['data']: > >> > + if i not in doc.args: > >> > + doc.args[i] =3D '' > >> > + body =3D texi_body(doc) > >> > + return ENUM_FMT(name=3Ddoc.symbol, > >> > + body=3Dbody) > >> > + > >> > + > >> > +def texi_struct(expr, doc): > >> > + """Format a struct to texi""" > >> > + body =3D texi_body(doc) > >> > + return STRUCT_FMT(type=3D"Struct", > >> > + name=3Ddoc.symbol, > >> > + body=3Dbody) > >> > + > >> > + > >> > +def texi_command(expr, doc): > >> > + """Format a command to texi""" > >> > + body =3D texi_body(doc) > >> > + return COMMAND_FMT(type=3D"Command", > >> > + name=3Ddoc.symbol, > >> > + body=3Dbody) > >> > + > >> > + > >> > +def texi_event(expr, doc): > >> > + """Format an event to texi""" > >> > + body =3D texi_body(doc) > >> > + return COMMAND_FMT(type=3D"Event", > >> > + name=3Ddoc.symbol, > >> > + body=3Dbody) > >> > + > >> > + > >> > +def texi_expr(expr, doc): > >> > + """Format an expr to texi""" > >> > + (kind, _) =3D expr.items()[0] > >> > + > >> > + fmt =3D {"command": texi_command, > >> > + "struct": texi_struct, > >> > + "enum": texi_enum, > >> > + "union": texi_union, > >> > + "alternate": texi_alternate, > >> > + "event": texi_event}[kind] > >> > + > >> > + return fmt(expr, doc) > >> > + > >> > + > >> > +def texi(docs): > >> > + """Convert QAPI schema expressions to texi documentation""" > >> > + res =3D [] > >> > + for doc in docs: > >> > + expr =3D doc.expr > >> > + if not expr: > >> > + res.append(texi_body(doc)) > >> > + continue > >> > + try: > >> > + doc =3D texi_expr(expr, doc) > >> > + res.append(doc) > >> > + except: > >> > + print >>sys.stderr, "error at @%s" % doc.info > >> > + raise > >> > + > >> > + return '\n'.join(res) > >> > + > >> > + > >> > +def main(argv): > >> > + """Takes schema argument, prints result to stdout""" > >> > + if len(argv) !=3D 2: > >> > + print >>sys.stderr, "%s: need exactly 1 argument: SCHEMA" % > >> > argv[0] > >> > + sys.exit(1) > >> > + > >> > + schema =3D qapi.QAPISchema(argv[1]) > >> > + print texi(schema.docs) > >> > + > >> > + > >> > +if __name__ =3D=3D "__main__": > >> > + main(sys.argv) > >> > diff --git a/docs/qapi-code-gen.txt b/docs/qapi-code-gen.txt > >> > index 2841c5144a..b29f996fdc 100644 > >> > --- a/docs/qapi-code-gen.txt > >> > +++ b/docs/qapi-code-gen.txt > >> > @@ -44,40 +44,148 @@ Input must be ASCII (although QMP supports full > >> > Unicode strings, the > >> > QAPI parser does not). At present, there is no place where a QAPI > >> > schema requires the use of JSON numbers or null. > >> > =20 > >> > + > >> > +=3D=3D=3D Comments =3D=3D=3D > >> > + > >> > Comments are allowed; anything between an unquoted # and the follow= ing > >> > -newline is ignored. Although there is not yet a documentation > >> > -generator, a form of stylized comments has developed for consistent= ly > >> > -documenting details about an expression and when it was added to th= e > >> > -schema. The documentation is delimited between two lines of ##, th= en > >> > -the first line names the expression, an optional overview is provid= ed, > >> > -then individual documentation about each member of 'data' is provid= ed, > >> > -and finally, a 'Since: x.y.z' tag lists the release that introduced > >> > -the expression. Optional members are tagged with the phrase > >> > -'#optional', often with their default value; and extensions added > >> > -after the expression was first released are also given a '(since > >> > -x.y.z)' comment. For example: > >> > - > >> > - ## > >> > - # @BlockStats: > >> > - # > >> > - # Statistics of a virtual block device or a block backing devic= e. > >> > - # > >> > - # @device: #optional If the stats are for a virtual block devic= e, > >> > the > >> > name > >> > - # corresponding to the virtual block device. > >> > - # > >> > - # @stats: A @BlockDeviceStats for the device. > >> > - # > >> > - # @parent: #optional This describes the file block device if it= has > >> > one. > >> > - # > >> > - # @backing: #optional This describes the backing block device i= f it > >> > has one. > >> > - # (Since 2.0) > >> > - # > >> > - # Since: 0.14.0 > >> > - ## > >> > - { 'struct': 'BlockStats', > >> > - 'data': {'*device': 'str', 'stats': 'BlockDeviceStats', > >> > - '*parent': 'BlockStats', > >> > - '*backing': 'BlockStats'} } > >> > +newline is ignored. > >> > + > >> > +A multi-line comment that starts and ends with a '##' line is a > >> > +documentation comment. These are parsed by the documentation > >> > +generator, which recognizes certain markup detailed below. > >> > + > >> > + > >> > +=3D=3D=3D=3D Documentation markup =3D=3D=3D=3D > >> > + > >> > +Comment text starting with '=3D' is a section title: > >> > + > >> > + # =3D Section title > >> > + > >> > +Double the '=3D' for a subsection title: > >> > + > >> > + # =3D=3D Subection title > >> > + > >> > +'|' denotes examples: > >> > + > >> > + # | Text of the example, may span > >> > + # | multiple lines > >> > + > >> > +'*' starts an itemized list: > >> > + > >> > + # * First item, may span > >> > + # multiple lines > >> > + # * Second item > >> > + > >> > +You can also use '-' instead of '*'. > >> > + > >> > +A decimal number followed by '.' starts a numbered list: > >> > + > >> > + # 1. First item, may span > >> > + # multiple lines >=20 > Should the second line be indented? ok >=20 > >> > + # 2. Second item > >> > + > >> > +The actual number doesn't matter. You could even use '*' instead o= f > >> > +'2.' for the second item. > >> > + > >> > +FIXME what exactly ends a list > >>=20 > >> Can you say offhand what ends a list? If yes, can we resolve this FIX= ME > >> now rather than later? > > > > well, the code is too permissive, there is a FIXME there already. > > > > In general, I think we want to end a list after a blank line and a > > "non-item" line. Is the wording ok? >=20 > Blank line is pretty restrictive, because list items consisting of > multiple paragraphs aren't possible then. >=20 > I think writing something like this is fairly natural: >=20 > * First item, may span > multiple lines. >=20 > Even multiple paragaphs are okay. >=20 > * Second item >=20 > - This is a sublist > - Second item of sublist >=20 > * Third item >=20 > This sentence is not part of the list. >=20 > We don't support sublists, and I'm not saying we should, I'm just > showing how they would fit in. >=20 > My point is that indentation is a natural way to delimit list items. >=20 > I don't expect you to implement this now. All I want for this series is > reasonable documentation. As far as I can tell, texi_format() ends the > list at a blank line, except when it doesn't (see its FIXME). >=20 > What about this: replace "FIXME what exactly ends a list" by >=20 > Lists can't be nested. Blank lines are currently not supported > within lists. >=20 > and add to texi_format()'s FIXME something like >=20 > Make sure to update section "Documentation markup" in > docs/qapi-code-gen.txt when fixing this. >=20 > Okay? ok >=20 > >> > + > >> > +Additional whitespace between the initial '#' and the comment text = is > >> > +permitted. > >> > + > >> > +*foo* and _foo_ are for strong and emphasis styles respectively (th= ey > >> > +do not work over multiple lines). @foo is used to reference a name = in > >> > +the schema. > >> > + > >> > +Example: > >> > + > >> > +## > >> > +# =3D Section > >> > +# =3D=3D Subsection > >> > +# > >> > +# Some text foo with *strong* and _emphasis_ > >> > +# 1. with a list > >> > +# 2. like that > >> > +# > >> > +# And some code: > >> > +# | $ echo foo > >> > +# | -> do this > >> > +# | <- get that > >> > +# > >> > +## > >> > + > >> > + > >> > +=3D=3D=3D=3D Expression documentation =3D=3D=3D=3D > >> > + > >> > +Each expression that isn't an include directive must be preceded by= a > >> > +documentation block. Such blocks are called expression documentati= on > >> > +blocks. > >> > + > >> > +The first line of the documentation names the expression, then the > >> > +documentation body is provided, then individual documentation about > >> > +each member of 'data' is provided. Finally, several tagged sections > >> > +can be added. > >>=20 > >> In my review of v6, I wrote: > >>=20 > >> I'm afraid this is more aspiration than specification: the parser > >> accepts these things in almost any order. > >>=20 > >> Could be filed under "deficiencies" for now. > >>=20 > >> To file under "deficiencies", we need to add a FIXME or TODO either > >> here, or in the parser. Assuming you haven't tightened the parser > >> meanwhile (hope you haven't; we need to control the churn). > > > > ok > > > >>=20 > >> Member of 'data' won't remain accurate. Consider: > >>=20 > >> { 'union': 'GlusterServer', > >> 'base': { 'type': 'GlusterTransport' }, > >> 'discriminator': 'type', > >> 'data': { 'unix': 'UnixSocketAddress', > >> 'tcp': 'InetSocketAddress' } } > >>=20 > >> Its doc comment currently doesn't contain argument sections. It > >> should > >> eventually contain @type:, @unix: and @tcp:. Only the latter two = are > >> members of 'data'. > >>=20 > >> I should propose something better, but I'm getting perilously clos= e to > >> the christmas break already. Later. > >>=20 > >> Also: passive voice is meh (not your idea; you adapted the existing > >> text). > >>=20 > >> What about: > >>=20 > >> The documentation block consists of a first line naming the > >> expression, an optional overview, a description of each argument (f= or > >> commands and events) or member (for structs, unions and alternates)= , > >> and optional tagged sections. > >>=20 > >> I still don't like the use of "section" for both the =3D things and th= e > >> tagged sections, but it's not important enough to spend time on it now= . > >>=20 > > > > ok > > > >> > + > >> > +Optional members are tagged with the phrase '#optional', often with > >>=20 > >> If we adopt my text, we should say "arguments / members" here. > >>=20 > > > > ok > > > >> > +their default value; and extensions added after the expression was > >> > +first released are also given a '(since x.y.z)' comment. > >> > + > >> > +A tagged section starts with one of the following words: > >> > +"Note:"/"Notes:", "Since:", "Example"/"Examples", "Returns:", "TODO= :". > >>=20 > >> What ends such a section? The start of the next one / end of the > >> comment block? > >>=20 > > > > yes > > > >> I suspect we could avoid PATCH 03 by recognizing "TODO" in addition to > >> "TODO:". Might also be more robust. But let's ignore this now. > >>=20 > >> > + > >> > +A 'Since: x.y.z' tag lists the release that introduced the expressi= on. > >>=20 > >> Nitpick: 'Since: x.y.z' is a tagged section, not a tag. > >>=20 > > > > ok > > > >> > + > >> > +For example: > >> > + > >> > +## > >> > +# @BlockStats: > >> > +# > >> > +# Statistics of a virtual block device or a block backing device. > >> > +# > >> > +# @device: #optional If the stats are for a virtual block device, t= he > >> > name > >> > +# corresponding to the virtual block device. > >> > +# > >> > +# @node-name: #optional The node name of the device. (since 2.3) > >> > +# > >> > +# ... more members ... > >> > +# > >> > +# Since: 0.14.0 > >> > +## > >> > +{ 'struct': 'BlockStats', > >> > + 'data': {'*device': 'str', '*node-name': 'str', > >> > + ... more members ... } } > >> > + > >> > +## > >> > +# @query-blockstats: > >> > +# > >> > +# Query the @BlockStats for all virtual block devices. > >> > +# > >> > +# @query-nodes: #optional If true, the command will query all the > >> > +# block nodes ... explain, explain ... (since 2.3) > >> > +# > >> > +# Returns: A list of @BlockStats for each virtual block devices. > >> > +# > >> > +# Since: 0.14.0 > >> > +# > >> > +# Example: > >> > +# > >> > +# -> { "execute": "query-blockstats" } > >> > +# <- { > >> > +# ... lots of output ... > >> > +# } > >> > +# > >> > +## > >> > +{ 'command': 'query-blockstats', > >> > + 'data': { '*query-nodes': 'bool' }, > >> > + 'returns': ['BlockStats'] } > >> > + > >> > +=3D=3D=3D=3D Free-form documentation =3D=3D=3D=3D > >> > + > >> > +A documentation block that isn't an expression documentation block = is > >> > +a free-form documentation block. These may be used to provide > >> > +additional text and structuring content. > >> > + > >> > + > >> > +=3D=3D=3D Schema overview =3D=3D=3D > >> > =20 > >> > The schema sets up a series of types, as well as commands and event= s > >> > that will use those types. Forward references are allowed: the par= ser > >> [Tests look okay, snipped...] > >>=20 >=20