From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: libvir-list@redhat.com, "Ján Tomko" <jtomko@redhat.com>,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] QMP; unsigned 64-bit ints; JSON standards compliance
Date: Tue, 7 May 2019 10:39:54 +0100 [thread overview]
Message-ID: <20190507093954.GG27205@redhat.com> (raw)
In-Reply-To: <87sgtqejn9.fsf@dusky.pond.sub.org>
On Tue, May 07, 2019 at 10:47:06AM +0200, Markus Armbruster wrote:
> > The Golang JSON parser decodes JSON numbers to float64 by default so
> > will have this precision limitation too, though at least they provide
> > a backdoor for custom parsing from the original serialized representation.
> >
> >> > QEMU, and indeed many applications, want to handle 64-bit integers.
> >> > The C JSON library impls have traditionally mapped integers to the
> >> > data type 'long long int' which gives a min/max of -(2^63) / 2^63-1.
> >> >
> >> > QEMU however /really/ needs 64-bit unsigned integers, ie a max 2^64-1.
>
> Correct.
>
> Support for integers 2^63..2^64-1 is relatively recent: commit
> 2bc7cfea095 (v2.10, 2017).
>
> Since we really needed these, the QObject input visitor silently casts
> negative integers to uint64_t. It still does for backward
> compatibility. Commit 5923f85fb82 (right after 2bc7cfea095) explains
>
> The input visitor will cast i64 input to u64 for compatibility
> reasons (existing json QMP client already use negative i64 for large
> u64, and expect an implicit cast in qemu).
>
> Note: before the patch, uint64_t values above INT64_MAX are sent over
> json QMP as negative values, e.g. UINT64_MAX is sent as -1. After the
> patch, they are sent unmodified. Clearly a bug fix, but we have to
> consider compatibility issues anyway. libvirt should cope fine,
> because its parsing of unsigned integers accepts negative values
> modulo 2^64. There's hope that other clients will, too.
So QEMU reading stuff sent by libvirt in a back compatible manner is
ok. The problem was specifically when a QEMU reply sent back UINT64_MAX
value as a positive number.
> >> > Libvirt has historically used the YAJL library which uses 'long long int'
> >> > and thus can't officially go beyond 2^63-1 values. Fortunately it lets
> >> > libvirt get at the raw json string, so libvirt can re-parse the value
> >> > to get an 'unsigned long long'.
> >> >
> >> > We recently tried to switch to Jansson because YAJL has a dead upstream
> >> > for many years and countless unanswered bugs & patches. Unfortunately we
> >> > forgot about this need for 2^64-1 max, and Jansson also uses 'long long int'
> >> > and raises a fatal parse error for unsigned 64-bit values above 2^63-1. It
> >> > also provides no backdoor for libvirt todo its own integer parsing. Thus
> >> > we had to abort our switch to jansson as it broke parsing QEMU's JSON:
> >> >
> >> > https://bugzilla.redhat.com/show_bug.cgi?id=1614569
> >> >
> >> > Other JSON libraries we've investigated have similar problems. I imagine
> >> > the same may well be true of non-C based JOSN impls, though I've not
> >> > investigated in any detail.
> >> >
> >> > Essentially libvirt is stuck with either using the dead YAJL library
> >> > forever, or writing its own JSON parser (most likely copying QEMU's
> >> > JSON code into libvirt's git).
> >> >
> >> > This feels like a very unappealing situation to be in as not being
> >> > able to use a JSON library of our choice is loosing one of the key
> >> > benefits of using a standard data format.
> >> >
> >> > Thus I'd like to see a solution to this to allow QMP to be reliably
> >> > consumed by any JSON library that exists.
>
> JSON is terrible at interoperability, so good luck with that.
>
> If you reduce your order to "the commonly used JSON libraries we know",
> we can talk.
I don't particularly want us to rely on semantics of small known set
of JSON libs. I really do want us to do something that is capable of
working with any JSON impl that exists in any programming language.
My suggested option 2 & 3 at least would manage that I believe, as
any credible JSON impl will be able to represent 32-bit integers
or strings without loosing data.
Option 1 would not cope as some impls can't even cope with
signed 64-bit ints.
> >> > I can think of some options:
> >> >
> >> > 1. Encode unsigned 64-bit integers as signed 64-bit integers.
> >> >
> >> > This follows the example that most C libraries map JSON ints
> >> > to 'long long int'. This is still relying on undefined
> >> > behaviour as apps don't need to support > 2^53-1.
> >> >
> >> > Apps would need to cast back to 'unsigned long long' for
> >> > those QMP fields they know are supposed to be unsigned.
>
> Ugly. It's also what we did until v2.10, August 2017. QMP's input
> direction still does it, for backward compatibility.
>
> >> >
> >> >
> >> > 2. Encode all 64-bit integers as a pair of 32-bit integers.
> >> >
> >> > This is fully compliant with the JSON spec as each half
> >> > is fully within the declared limits. App has to split or
> >> > assemble the 2 pieces from/to a signed/unsigned 64-bit
> >> > int as needed.
>
> Differently ugly.
>
> >> >
> >> >
> >> > 3. Encode all 64-bit integers as strings
> >> >
> >> > The application has todo all parsing/formatting client
> >> > side.
>
> Yet another ugly.
>
> >> >
> >> >
> >> > None of these changes are backwards compatible, so I doubt we could make
> >> > the change transparently in QMP. Instead we would have to have a
> >> > QMP greeting message capability where the client can request enablement
> >> > of the enhanced integer handling.
>
> We might be able to do option 1 without capability negotiation. v2.10's
> change from option 1 to what we have now produced zero complaints.
>
> On the other hand, we made that change for a reason, so we may want a
> "send large integers as negative integers" capability regardless.
>
> >> >
> >> > Any of the three options above would likely work for libvirt, but I
> >> > would have a slight preference for either 2 or 3, so that we become
> >> > 100% standards compliant.
>
> There's no such thing. You mean "we maximize interoperability with
> common implementations of JSON".
s/common/any/
> Let's talk implementation for a bit.
>
> Encoding and decoding integers in funny ways should be fairly easy in
> the QObject visitors. The generated QMP marshallers all use them.
> Trouble is a few commands still bypass the generated marshallers, and
> mess with the QObject themselves:
>
> * query-qmp-schema: minor hack explained in qmp_query_qmp_schema()'s
> comment. Should be harmless.
>
> * netdev_add: not QAPIfied. Eric's patches to QAPIfy it got stuck
> because they reject some abuses like passing numbers and bools as
> strings.
>
> * device_add: not QAPIfied. We're not sure QAPIfication is feasible.
>
> netdev_add and device_add both use qemu_opts_from_qdict(). Perhaps we
> could hack that to mirror what the QObject visitor do.
>
> Else, we might have to do it in the JSON parser. Should be possible,
> but I'd rather not.
>
> >> My preference would be 3 with the strings defined as being
> >> %x lower case hex formated with a 0x prefix and no longer than 18 characters
> >> ("0x" + 16 nybbles). Zero padding allowed but not required.
> >> It's readable and unambiguous when dealing with addresses; I don't want
> >> to have to start decoding (2) by hand when debugging.
> >
> > Yep, that's a good point about readability.
>
> QMP sending all integers in decimal is inconvenient for some values,
> such as addresses. QMP sending all (large) integers in hexadecimal
> would be inconvenient for other values.
>
> Let's keep it simple & stupid. If you want sophistication, JSON is the
> wrong choice.
>
>
> Option 1 feels simplest.
But will still fail with any JSON impl that uses double precision floating
point for integers as it will loose precision.
> Option 2 feels ugliest. Less simple, more interoperable than option 1.
If we assume any JSON impl can do 32-bit integers without loss of
precision, then I think we can say it is guaranteed portable, but
it is certainly horrible / ugly.
> Option 3 is like option 2, just not quite as ugly.
I think option 3 can be guaranteed to be loss-less with /any/ JSON impl
that exists, since you're delegating all string -> int conversion to
the application code taking the JSON parser/formatter out of the equation.
This is close to the approach libvirt takes with YAJL parser today. YAJL
parses as a int64 and we then ignore its result, and re-parse the string
again in libvirt as uint64. When generating json we format as uint64
in libvirt and ignore YAJLs formatting for int64.
> Can we agree to eliminate option 2 from the race?
I'm fine with eliminating option 2.
I guess I'd have a preference for option 3 given that it has better
interoperability
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
next prev parent reply other threads:[~2019-05-07 9:41 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-04-30 13:19 [Qemu-devel] QMP; unsigned 64-bit ints; JSON standards compliance Daniel P. Berrangé
2019-04-30 13:19 ` Daniel P. Berrangé
2019-04-30 14:45 ` Dr. David Alan Gilbert
2019-04-30 14:45 ` Dr. David Alan Gilbert
2019-04-30 15:05 ` Daniel P. Berrangé
2019-04-30 15:05 ` Daniel P. Berrangé
2019-05-07 8:47 ` Markus Armbruster
2019-05-07 9:39 ` Daniel P. Berrangé [this message]
2019-05-07 16:32 ` Eric Blake
2019-05-08 12:37 ` Markus Armbruster
2019-05-08 12:44 ` Dr. David Alan Gilbert
2019-05-08 12:44 ` Markus Armbruster
2019-05-13 12:08 ` Daniel P. Berrangé
2019-05-13 12:29 ` Dr. David Alan Gilbert
2019-05-13 12:35 ` Daniel P. Berrangé
2019-05-13 14:10 ` Markus Armbruster
2019-05-13 13:53 ` Markus Armbruster
2019-05-13 14:10 ` Daniel P. Berrangé
2019-05-13 15:15 ` [Qemu-devel] [libvirt] " Eric Blake
2019-05-14 6:02 ` Markus Armbruster
2019-05-14 9:26 ` Daniel P. Berrangé
2019-05-14 9:37 ` Dr. David Alan Gilbert
2019-05-14 9:43 ` Daniel P. Berrangé
2019-05-14 9:47 ` Peter Krempa
2019-06-04 6:38 ` [Qemu-devel] " Markus Armbruster
2019-06-05 13:06 ` Daniel P. Berrangé
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190507093954.GG27205@redhat.com \
--to=berrange@redhat.com \
--cc=armbru@redhat.com \
--cc=dgilbert@redhat.com \
--cc=jtomko@redhat.com \
--cc=libvir-list@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).