All of lore.kernel.org
 help / color / mirror / Atom feed
From: Markus Armbruster <armbru@redhat.com>
To: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: "Eduardo Habkost" <ehabkost@redhat.com>,
	qemu-block@nongnu.org, "Juan Quintela" <quintela@redhat.com>,
	"Jason Wang" <jasowang@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	qemu-devel@nongnu.org, "Yuval Shaia" <yuval.shaia.ml@gmail.com>,
	"Gerd Hoffmann" <kraxel@redhat.com>,
	"Andrea Bolognani" <abologna@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Marc-André Lureau" <marcandre.lureau@redhat.com>,
	"John Snow" <jsnow@redhat.com>,
	"Michael Roth" <mdroth@linux.vnet.ibm.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	"Stefan Berger" <stefanb@linux.ibm.com>
Subject: Re: [PATCH] schemas: Add vim modeline
Date: Fri, 31 Jul 2020 14:55:34 +0200	[thread overview]
Message-ID: <875za33ku1.fsf@dusky.pond.sub.org> (raw)
In-Reply-To: <20200730132446.GL3477223@redhat.com> ("Daniel P. Berrangé"'s message of "Thu, 30 Jul 2020 14:24:46 +0100")

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Thu, Jul 30, 2020 at 01:51:10PM +0200, Markus Armbruster wrote:
>> Daniel P. Berrangé <berrange@redhat.com> writes:
>> 
>> >                               modify them so that we can load the 
>> > files straight into the python intepretor as code, and not parse 
>> > them as data. I feel unhappy about treating data as code though.
>> 
>> Stress on *can* load.  Doesn't mean we should.
>> 
>> Ancient prior art: Lisp programs routinely use s-expressions as
>> configuration file syntax.  They don't load them as code, they read them
>> as data.
>> 
>> With Python, it's ast.parse(), I think.
>
> Yes, that could work
>
>
>> > struct: ImageInfoSpecificQCow2
>> > data:
>> >   compat: str
>> >   "*data-file": str
>> >   "*data-file-raw": bool
>> >   "*lazy-refcounts": bool
>> >   "*corrupt": bool
>> >   refcount-bits: int
>> >   "*encrypt": ImageInfoSpecificQCow2Encryption
>> >   "*bitmaps":
>> >     - Qcow2BitmapInfo
>> >   compression-type: Qcow2CompressionType
>> >
>> >
>> > Then we could use a regular off the shelf YAML parser in python.
>> >
>> > The uglyiness with quotes is due to the use of "*". Slightly less ugly
>> > if we simply declare that quotes are always used, even where they're
>> > not strictly required.
>> 
>> StrictYAML insists on quotes.
>
> I wouldn't suggest StrictYAML, just normal YAML is what pretty much
> everyone uses.
>
> If we came up with a different way to mark a field as optional
> instead of using the magic "*" then we wouldn't need to quote
> anything

Nope.

Stupid test script:

    $ cat test-yaml.py
    #!/usr/bin/python3

    import sys, yaml

    data = yaml.load(sys.stdin, Loader=yaml.CLoader)
    print(data)

Example taken from block.json:

    $ ./test-yaml.py <<EOF
    enum: FloppyDriveType
    data:
    - 144
    - 288
    - 120
    - none
    - auto
    EOF
    {'enum': 'FloppyDriveType', 'data': [144, 288, 120, 'none', 'auto']}

The upper layer will choke on this:

    qapi/block.yaml: In enum 'FloppyDriveType':
    qapi/block.yaml:61: 'data' member requires a string name

You could propose to provide "wouldn't need to quote anything" by
silently converting numbers back to strings.  I got two issues with
that.

1. What's the point in switching to an off-the-shelf parser to replace
less than 400 SLOC if I then have to write non-trivial glue code to make
the syntax less surprising?

2. Let me trot out the tired Norway problem again.  Say we QAPIfy -k so
that its argument is an enum, for nice introspection.  Something like

    $ ./test-yaml.py <<EOF
    enum: Keymap
    data:
    - ar
    - cz
    - de
    - de-ch
    - en-gb
    - en-us
    - no
    - pt
    - pt-br
    EOF
    {'enum': 'Keymap', 'data': ['ar', 'cz', 'de', 'de-ch', 'en-gb', 'en-us', False, 'pt', 'pt-br']}

To which of the eleven ways to say False in YAML should we convert?

YAML and the schema language are fundamentally at odds here: they fight
over types.  YAML lets the value determine the type regardless of
context.  But in the schema, the context determines the type.

>> I hate having to quote identifiers.  There's a reason we don't write
>> 
>>     'int'
>>     'main'('int', 'argc', 'char' *'argv'[])
>>     {
>>         'printf'("hello world\n");
>>         return 0;
>>     }
>> 
>> > struct: ImageInfoSpecificQCow2
>> > data:
>> >   "compat": "str"
>> >   "*data-file": "str"
>> >   "*data-file-raw": "bool"
>> >   "*lazy-refcounts": "bool"
>> >   "*corrupt": "bool"
>> >   "refcount-bits": "int"
>> >   "*encrypt": "ImageInfoSpecificQCow2Encryption"
>> >   "*bitmaps":
>> >     - "Qcow2BitmapInfo"
>> >   "compression-type": "Qcow2CompressionType"
>> >
>> > With the use of "---" to denote the start of document, we have no trouble 
>> > parsing our files which would actually be a concatenation of multiple 
>> > documents. The python YAML library provides the easy yaml.load_all()
>> > method.
>> 
>> Required reading on YAML:
>> https://www.arp242.net/yaml-config.html
>
> I don't think this is especially helpful to our evaluation. You can write
> such blog posts about pretty much any thing if you want to pick holes in a
> proposal. Certainly there's plenty of awful stuff you can write about
> JSON, and Python.

Picking holes in a proposal is precisely what we need to do before we
act on it and rebase our QAPI schema DSL.  That's expensive and painful,
so we better don't screw it up *again*.

Here's my superficial five minute assessment of the essay's main points
for our use case:

* Insecure by default

  Valid criticism of existing YAML tools, but hardly relevant for us,
  because 1. "don't do that then", and 2. the QAPI schema is trusted
  input.

* Can be hard to edit, especially for large files

  Valid when you use YAML to describe data.  We would use it as an IDL,
  though.  If parts of the interface description get too deeply nested
  or too long for comfort, we better provide means to rearrange it in
  more pleasant ways.  However, if our new base language got
  uncomfortable earlier than the old one, and the existing means to
  rearrange prove to weak (they are pretty weak), then we'd create
  additional work.

  I'm cautiously optimistic that YAML would do okay here.

* It’s pretty complex

  If you go "we'll use only a simple subset", then I go "define the
  subset, and tell me how to enforce the subset.  I have no interest in
  teaching contributors one by one which of the nine ways to write a
  multiline string to avoid, or which of the eleven ways to say False to
  use.

* Surprising behavior

  See "fight over types" above.

* It’s not portable

  Probably irrelevant, because feeding the schema to another YAML parser
  is so unlikely to be useful.

>> Some of the criticism there doesn't matter for our use case.
>
> Yeah, what matters is whether it can do the job we need in a way that is
> better than what we have today, and whether there are any further options
> to consider that might be viable alternatives.

Would it improve things enough to be worth the switching pain?



  parent reply	other threads:[~2020-07-31 12:56 UTC|newest]

Thread overview: 85+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-29 18:50 [PATCH] schemas: Add vim modeline Andrea Bolognani
2020-07-30  9:07 ` Markus Armbruster
2020-07-30  9:37   ` Daniel P. Berrangé
2020-07-30 11:51     ` Markus Armbruster
2020-07-30 13:24       ` Daniel P. Berrangé
2020-07-31  6:45         ` John Snow
2020-07-31  9:21           ` Markus Armbruster
2020-07-31  9:32             ` Daniel P. Berrangé
2020-07-31 12:55         ` Markus Armbruster [this message]
2020-07-31 15:07           ` Daniel P. Berrangé
2020-07-31 15:26             ` John Snow
2020-07-31 15:44               ` Daniel P. Berrangé
2020-08-03  7:28                 ` Markus Armbruster
2020-08-03  8:41                 ` Paolo Bonzini
2020-08-03 11:24                   ` Markus Armbruster
2020-08-03 11:36                   ` Daniel P. Berrangé
2020-08-03 12:16                     ` Paolo Bonzini
2020-08-03 12:23                       ` Daniel P. Berrangé
2020-08-03 12:33                         ` Paolo Bonzini
2020-08-03 12:43                           ` Daniel P. Berrangé
2020-08-03 15:48                         ` Markus Armbruster
2020-08-03 21:02                         ` Nir Soffer
2020-07-31 16:35               ` Paolo Bonzini
2020-07-31 16:41                 ` Dr. David Alan Gilbert
2020-07-31 17:20                 ` Daniel P. Berrangé
2020-07-31 17:47                   ` Paolo Bonzini
2020-08-03  9:44                     ` Daniel P. Berrangé
2020-07-31 17:53                 ` John Snow
2020-07-31 18:01                   ` Paolo Bonzini
2020-08-03  7:45                     ` Markus Armbruster
2020-07-31 16:28             ` cleanups with long-term benefits (was Re: [PATCH] schemas: Add vim modeline) Paolo Bonzini
2020-07-31 17:05               ` Daniel P. Berrangé
2020-07-31 17:16                 ` Paolo Bonzini
2020-07-31 17:27                   ` Daniel P. Berrangé
2020-07-31 17:42                     ` Paolo Bonzini
2020-08-03  9:27                       ` Daniel P. Berrangé
2020-08-03  8:18               ` Markus Armbruster
2020-08-03  8:42                 ` Paolo Bonzini
2020-08-03 11:28                   ` Markus Armbruster
2020-08-03 12:01                     ` Paolo Bonzini
2020-08-03 16:03                       ` Markus Armbruster
2020-08-03 16:36                         ` Kevin Wolf
2020-08-04  7:28                           ` Markus Armbruster
2020-08-03 17:19                         ` Paolo Bonzini
2020-08-04  8:03                           ` Markus Armbruster
2020-08-04 18:24                             ` John Snow
2020-08-05  7:36                               ` Markus Armbruster
2020-08-05  8:25                                 ` Paolo Bonzini
2020-08-05  8:39                                   ` Dr. David Alan Gilbert
2020-08-05  8:49                                     ` Paolo Bonzini
2020-08-05  9:05                                       ` Daniel P. Berrangé
2020-08-05  9:11                                         ` cleanups with long-term benefits Cornelia Huck
2020-08-05 10:08                                           ` Daniel P. Berrangé
2020-08-05 10:24                                             ` Cornelia Huck
2020-08-05 16:23                                             ` Kevin Wolf
2020-08-05 16:46                                               ` Eduardo Habkost
2020-08-06  5:44                                                 ` Markus Armbruster
2020-08-05  8:47                                   ` Cornelia Huck
2020-08-05  8:56                                   ` cleanups with long-term benefits (was Re: [PATCH] schemas: Add vim modeline) Markus Armbruster
2020-08-05 10:15                                     ` Alex Bennée
2020-08-05 16:04                                 ` John Snow
2020-08-06  4:58                                   ` Markus Armbruster
2020-08-05  8:42                         ` Markus Armbruster
2020-08-03 18:10                       ` John Snow
2020-08-03 18:16                         ` Paolo Bonzini
2020-08-03 18:19                           ` John Snow
2020-08-03 19:54                             ` Nir Soffer
2020-08-03 20:48                               ` John Snow
2020-08-03  9:50                 ` Daniel P. Berrangé
2020-08-03 11:32                   ` Markus Armbruster
2020-07-31 16:39             ` [PATCH] schemas: Add vim modeline Kevin Wolf
2020-07-30 15:11       ` Eric Blake
2020-07-30 20:53         ` John Snow
2020-07-30 20:56         ` John Snow
2020-07-31  7:15         ` Kevin Wolf
2020-07-31  8:48           ` Daniel P. Berrangé
2020-07-31  9:01           ` Markus Armbruster
2020-07-31 11:26             ` Kevin Wolf
2020-08-03  8:51               ` Markus Armbruster
2020-07-31 23:12     ` Nir Soffer
2020-08-03 12:16       ` Paolo Bonzini
2020-08-04  7:28         ` Markus Armbruster
2020-08-04  8:29     ` Alex Bennée
2020-09-07 13:54   ` Markus Armbruster
2020-07-30 13:14 ` Markus Armbruster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=875za33ku1.fsf@dusky.pond.sub.org \
    --to=armbru@redhat.com \
    --cc=abologna@redhat.com \
    --cc=berrange@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=jsnow@redhat.com \
    --cc=kraxel@redhat.com \
    --cc=marcandre.lureau@redhat.com \
    --cc=mdroth@linux.vnet.ibm.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=stefanb@linux.ibm.com \
    --cc=yuval.shaia.ml@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.