qemu-devel.nongnu.org archive mirror
From: "Daniel P. Berrangé" <berrange@redhat.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: qemu-devel@nongnu.org, libvir-list@redhat.com,
	"Ján Tomko" <jtomko@redhat.com>,
	"Markus Armbruster" <armbru@redhat.com>
Subject: Re: [Qemu-devel] QMP; unsigned 64-bit ints; JSON standards compliance
Date: Tue, 30 Apr 2019 16:05:56 +0100
Message-ID: <20190430150556.GA2423@redhat.com>
In-Reply-To: <20190430144546.GA3065@work-vm>

On Tue, Apr 30, 2019 at 03:45:46PM +0100, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > The QEMU QMP service is based on JSON, which is nice because that is a
> > widely supported "standard" data format.....
> > 
> > ....except QEMU's implementation (and indeed most impls) is not strictly
> > standards compliant.
> > 
> > Specifically the problem is around representing 64-bit integers, whether
> > signed or unsigned.
> > 
> > The JSON standard declares that the largest integer is 2^53-1 and
> > likewise the smallest is -(2^53-1):
> > 
> >   http://www.ecma-international.org/ecma-262/6.0/index.html#sec-number.max_safe_integer
> > 
> > A crazy limit inherited from its javascript origins IIUC.
> 
> Ewwww.

Looking a bit deeper, it seems this limit comes from the use of double
precision floating point for storing integers. 2^53-1 is the largest
integer n such that both n and n+1 can be stored in a 64-bit float
exactly; from 2^53+1 onwards some integers round to a neighbouring value.
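
As a quick check of that boundary, here is a small sketch. Python is
used purely for illustration (its float is an IEEE 754 double on
essentially every platform), and the helper name survives_double is
made up for this example.

```python
# Illustration only: which integers survive a round trip through a double?

def survives_double(n: int) -> bool:
    """Return True if n converts to a 64-bit float and back unchanged."""
    return int(float(n)) == n

MAX_SAFE = 2**53 - 1

print(survives_double(MAX_SAFE))          # True
print(survives_double(2**53))             # True, 2^53 itself is representable
print(survives_double(2**53 + 1))         # False, rounds to 2^53
print(float(2**53) == float(2**53 + 1))   # True, two integers share one double
print(survives_double(2**64 - 1))         # False, rounds to 2^64
```

So a parser that keeps numbers as doubles cannot tell 2^53 and 2^53+1
apart, and a value like 2^64-1 silently comes back as 2^64.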

The Golang JSON parser decodes JSON numbers to float64 by default, so it
will have this precision limitation too, though at least they provide
a backdoor for custom parsing from the original serialized representation.
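
This is not Go, but the same idea can be sketched with Python's json
module, whose parse_int hook is handed the original digit string of
every JSON integer. Passing float imitates a parser that defaults to
doubles; passing str imitates the raw-text backdoor. The sample document
and the "addr" field name are invented for illustration.

```python
import json

# 2^64-1 in a made-up QMP-like reply.
DOC = '{"addr": 18446744073709551615}'

# Imitate a parser that stores every number as a double.
lossy = json.loads(DOC, parse_int=float)

# Imitate the backdoor: keep the serialized digits and convert them ourselves.
raw = json.loads(DOC, parse_int=str)
exact = int(raw["addr"])

print(int(lossy["addr"]) == 2**64 - 1)   # False, precision was lost
print(exact == 2**64 - 1)                # True
```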

> > QEMU, and indeed many applications, want to handle 64-bit integers.
> > The C JSON library impls have traditionally mapped integers to the
> > data type 'long long int' which gives a min/max of -(2^63) / 2^63-1.
> > 
> > QEMU however /really/ needs 64-bit unsigned integers, ie a max 2^64-1.
> > 
> > Libvirt has historically used the YAJL library which uses 'long long int'
> > and thus can't officially go beyond 2^63-1 values. Fortunately it lets
> > libvirt get at the raw json string, so libvirt can re-parse the value
> > to get an 'unsigned long long'.
> > 
> > We recently tried to switch to Jansson because YAJL has had a dead upstream
> > for many years, with countless unanswered bugs & patches. Unfortunately we
> > forgot about this need for 2^64-1 max, and Jansson also uses 'long long int'
> > and raises a fatal parse error for unsigned 64-bit values above 2^63-1. It
> > also provides no backdoor for libvirt to do its own integer parsing. Thus
> > we had to abort our switch to Jansson as it broke parsing QEMU's JSON:
> > 
> >   https://bugzilla.redhat.com/show_bug.cgi?id=1614569
> > 
> > Other JSON libraries we've investigated have similar problems. I imagine
> > the same may well be true of non-C based JSON impls, though I've not
> > investigated in any detail.
> > 
> > Essentially libvirt is stuck with either using the dead YAJL library
> > forever, or writing its own JSON parser (most likely copying QEMU's
> > JSON code into libvirt's git).
> > 
> > This feels like a very unappealing situation to be in as not being
> > able to use a JSON library of our choice is losing one of the key
> > benefits of using a standard data format.
> > 
> > Thus I'd like to see a solution to this to allow QMP to be reliably
> > consumed by any JSON library that exists.
> > 
> > I can think of some options:
> > 
> >   1. Encode unsigned 64-bit integers as signed 64-bit integers.
> > 
> >      This follows the example that most C libraries map JSON ints
> >      to 'long long int'. This is still relying on undefined
> >      behaviour as apps don't need to support > 2^53-1.
> > 
> >      Apps would need to cast back to 'unsigned long long' for
> >      those QMP fields they know are supposed to be unsigned.
> > 
> > 
> >   2. Encode all 64-bit integers as a pair of 32-bit integers.
> >     
> >      This is fully compliant with the JSON spec as each half
> >      is fully within the declared limits. App has to split or
> >      assemble the 2 pieces from/to a signed/unsigned 64-bit
> >      int as needed.
> > 
> > 
> >   3. Encode all 64-bit integers as strings
> > 
> >      The application has to do all parsing/formatting client
> >      side.
> > 
> > 
> > None of these changes are backwards compatible, so I doubt we could make
> > the change transparently in QMP.  Instead we would have to have a
> > QMP greeting message capability where the client can request enablement
> > of the enhanced integer handling.
> > 
> > Any of the three options above would likely work for libvirt, but I
> > would have a slight preference for either 2 or 3, so that we become
> > 100% standards compliant.
> 
> My preference would be 3 with the strings defined as being
> %x lower case hex formatted, with a 0x prefix and no longer than 18 characters
> ("0x" + 16 nybbles). Zero padding allowed but not required.
> It's readable and unambiguous when dealing with addresses; I don't want
> to have to start decoding (2) by hand when debugging.

Yep, that's a good point about readability.
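
For comparison, here is a rough sketch of what the three options look
like for a single value, with option 3 using the hex format suggested
above. Nothing here is an agreed wire format: the function names and the
"hi"/"lo" keys are hypothetical, and Python is used only to keep the
example short.

```python
import re

MASK64 = (1 << 64) - 1

def u64_to_signed(value: int) -> int:
    """Option 1: send an unsigned 64-bit value as its two's complement signed twin."""
    if not 0 <= value <= MASK64:
        raise ValueError("not an unsigned 64-bit value")
    return value - (1 << 64) if value >= (1 << 63) else value

def signed_to_u64(value: int) -> int:
    """Option 1, client side: the cast back to unsigned."""
    return value & MASK64

def u64_to_pair(value: int) -> dict:
    """Option 2: split into two 32-bit halves (key names are made up)."""
    return {"hi": value >> 32, "lo": value & 0xFFFFFFFF}

def pair_to_u64(pair: dict) -> int:
    return (pair["hi"] << 32) | pair["lo"]

def u64_to_hex(value: int) -> str:
    """Option 3: "0x" plus up to 16 lower-case nybbles, no padding."""
    return "0x%x" % value

def hex_to_u64(text: str) -> int:
    # Zero padding is allowed, anything longer than 18 characters is not.
    if not re.fullmatch(r"0x[0-9a-f]{1,16}", text):
        raise ValueError("not a 0x-prefixed lower-case hex string")
    return int(text[2:], 16)

addr = 0xfffffffffffff000              # made-up address above 2^63-1
print(u64_to_signed(addr))             # -4096
print(u64_to_pair(addr))               # {'hi': 4294967295, 'lo': 4294963200}
print(u64_to_hex(addr))                # 0xfffffffffffff000
```

Only the last form can be matched against an address by eye, which is
the debugging argument for the hex string.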

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
