Re: [diamon-discuss] Common Trace Format 1.9 planning

From: Philippe Proulx <pproulx@efficios.com>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: diamon-discuss@lists.linuxfoundation.org
Subject: Re: [diamon-discuss] Common Trace Format 1.9 planning
Date: Wed, 25 Mar 2015 21:41:47 +0000 (UTC)
Message-ID: <2022179888.104000.1427319707820.JavaMail.zimbra@efficios.com>
In-Reply-To: <939295554.103805.1427317243385.JavaMail.zimbra@efficios.com>

----- Original Message -----
> From: "Mathieu Desnoyers" <mathieu.desnoyers@efficios.com>
> To: "Simon Marchi" <simon.marchi@ericsson.com>, "Philippe Proulx" <pproulx@efficios.com>
> Cc: diamon-discuss@lists.linuxfoundation.org
> Sent: Wednesday, 25 March, 2015 5:00:43 PM
> Subject: Re: [diamon-discuss] Common Trace Format 1.9 planning
> 
> Philippe might want to chime in.

Yes, I might.

Comments below.

> 
> ----- Original Message -----
> > ----- Original Message -----
> > > On 15-03-25 03:22 PM, Mathieu Desnoyers wrote:
> > > > > This is great, so v1.9 is a superset of 1.8.
> > > >
> > > > No. A superset would be a completely backward compatible 1.9.
> > > > This is not the case here. We plan to do incompatible changes
> > > > within 1.9, hence the 1.8 and 1.9 parsers needed. Since the
> > > > version is self-described, it should not be an issue to detect
> > > > the CTF version from the trace metadata.
> > > 
> > > From what I understand, it means that a reader for 1.9 won't be able to
> > > read
> > > a 1.8 trace, is that right?
> > 
> > Yes, we plan to add incompatible grammar changes. This might have to be
> > versioned as 2.0 then. However, we can have side-by-side implementations
> > of CTF 1.8 and 2.0 readers within a trace reading lib, and therefore
> > distinguish between those, and use the proper CTF reader implementation.
> > 
> > > 
> > > Do the version numbers mean something? If introducing non backward
> > > compatible
> > > changes only bumps the minor version, what could ever bump the major?
> > 
> > Good question! In this case, we might be talking about a CTF 2.0 then,
> > since we are planning non-compatible changes.
> > 
> > > 
> > > If they don't have one already, this could be an opportunity to give a
> > > meaning
> > > to the numbers. The major could be for breaking backward compatibility,
> > > while
> > > the minor could be for backward-compatible changes. It would mean that a
> > > reader
> > > for x.y should be able to read any x.z trace, where z <= y. In other
> > > words,
> > > the
> > > x.y format would be a superset of x.z (I think?). Much like semver.org,
> > > but
> > > without the PATCH level.
> > 
> > Yes, I think it's a good approach.
> > 
> > We have been hesitating between moving from CTF 1.8 to either 1.9 or 2.0.
> > Here are the upsides/downsides of each approach:
> > 
> > * 1.8 to 1.9:
> >   + Compatibility: New CTF 1.9 readers would be able to read CTF 1.8
> >   traces,
> >   - Incompatibility: CTF 1.8 readers would not be able to read CTF 1.9
> >   traces,
> >   - Complexity: The CTF 1.9 specification would need to be an exact subset

I guess you meant "superset" here.

> >     of 1.8, which means a more complex spec, grammar, and implementations,
> > 
> > * 1.8 to 2.0:
> >   + Compatibility: Since we're keeping the version headers, a trace reader
> >     can implement parsers for both CTF 1.8 and 2.0, and read both trace
> >     formats without requiring user interaction,
> >   - Incompatibility: CTF 1.8 readers would not be able to read CTF 2.0
> >   traces,
> >   + Simplicity: New CTF 2.0 readers would be simpler, since they would not
> >     need to read CTF 1.8 traces,
> > 
> > Since the user-visible impacts of bumping from 1.8 to 1.9 or from 1.8 to
> > 2.0
> > appear to be the same for existing implementations of the spec, I am really
> > tempted to bump to 2.0 at this stage. Already having the version number in
> > the header makes the transition so much easier to manage.
> > 
> > Thoughts ?

File formats have various ways of identifying their versions.

The "serial" approach (a single version number) is common. One example is
Microsoft's Bitmap format:
<http://upload.wikimedia.org/wikipedia/commons/c/c4/BMPfileFormat.png>.
Here, the DIB Header Size field gives the size of the DIB header, and
from this size a Bitmap file reader knows which fields are available.
This header size can be seen as a serial version number. As the Bitmap
format evolved over the years, new fields were added, and the DIB Header
Size value grew each time. Old readers can still read newer Bitmap files:
they use the header size to skip the fields they do not know and continue
from there. Such a reader misses some information, but should still be
able to decode the image.
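
To make the skipping mechanism concrete, here is a minimal Python sketch of
such an "old" reader. It assumes the common layout in which a 14-byte file
header precedes the DIB header, and in which headers of 40 bytes or more
start with the same five fields. The names read_dib_header, KNOWN_FIELDS and
FILE_HEADER_SIZE are made up for this illustration.

```python
import struct

# Fields this (hypothetical) old reader understands: header size, width,
# height, color planes, bits per pixel. All little-endian.
KNOWN_FIELDS = struct.Struct("<IiiHH")
FILE_HEADER_SIZE = 14  # the fixed-size file header precedes the DIB header


def read_dib_header(buf):
    """Return (known fields, offset of the first byte after the DIB header)."""
    size, width, height, planes, bpp = KNOWN_FIELDS.unpack_from(
        buf, FILE_HEADER_SIZE)
    if size < 40:
        # Assumption of this sketch: smaller headers use another layout.
        raise ValueError("unsupported DIB header size: %d" % size)
    fields = {"width": width, "height": height, "bpp": bpp}
    # The size field tells us where the header ends, so any newer fields
    # after the ones we know are skipped without being understood.
    return fields, FILE_HEADER_SIZE + size
```

Given a 40-byte header or a longer one from a later revision, this reader
returns the same known fields; only the returned offset differs.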

I think the major.minor scheme should apply the same principle to the
minor version: a CTF reader written from the CTF 1.8 specification should
be able to decode CTF 1.9 traces, CTF 1.10 traces, and so on, with no
changes. This same reader would not be expected to decode CTF 2.0 traces
(major bump).

This means that a new minor version may only _add_ features, and must add
them in such a way that older readers of the same major version can still
decode the new format as they did before. A well-designed API works the
same way: it may add features when bumping its minor version, but it must
ensure that older applications keep working unchanged against the new
version.
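
As a sketch of the rule proposed here (not of any existing reader), the
compatibility check reduces to comparing major numbers. The helper names
parse_version and can_decode are hypothetical. Note that versions have to be
compared as integer pairs: read as decimal numbers, "1.10" and "1.1" would
be the same version.

```python
def parse_version(text):
    """Split a "major.minor" string into a pair of integers."""
    major, minor = text.split(".")
    return int(major), int(minor)


def can_decode(reader_version, trace_version):
    """Under the proposed rule, only the major number has to match.

    A trace with a higher minor number may carry additions that the
    reader ignores, but it remains decodable.
    """
    reader_major, _ = parse_version(reader_version)
    trace_major, _ = parse_version(trace_version)
    return reader_major == trace_major
```

With these helpers, can_decode("1.8", "1.10") is true and
can_decode("1.8", "2.0") is false.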

Here's an example: if a hypothetical CTF 1.2 says:

    Integers are defined this way:

    integer {
        size = <size>;
        align = <alignment>;
        <other optional attributes here>
    }

    Other optional attributes may be ignored.

and CTF 1.3 says:

    Integers are defined this way:

    integer {
        size = <size>;
        align = <alignment>;
        base = <base>;
        <other optional attributes here>
    }

    Other optional attributes may be ignored.

Then a CTF 1.2 reader would be able to read a CTF 1.3 trace, but it
would ignore the "base" attribute (which counts as an "other optional
attribute" as per CTF 1.2) and display all integers in base 10.

A CTF 1.3 reader would know this "base" attribute and treat it specially,
displaying integers with the provided radix.
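
The two hypothetical readers above can be sketched in Python as follows.
This is a toy parser, not TSDL: it assumes every attribute has the form
"name = value;" and that "base" is given as a number. The function names and
the KNOWN_1_2/KNOWN_1_3 sets are invented for the example.

```python
import re

# Matches "name = value;" pairs inside an integer block.
ATTR_RE = re.compile(r"(\w+)\s*=\s*([^;]+);")

KNOWN_1_2 = {"size", "align"}          # hypothetical CTF 1.2 reader
KNOWN_1_3 = {"size", "align", "base"}  # hypothetical CTF 1.3 reader


def parse_integer(block, known):
    """Keep the attributes this reader knows; ignore the others."""
    attrs = {}
    for name, value in ATTR_RE.findall(block):
        if name in known:
            attrs[name] = value.strip()
        # Unknown attributes parse fine but carry no semantics (yet).
    return attrs


def format_value(value, attrs):
    """Display an integer in the declared base, defaulting to base 10."""
    base = int(attrs.get("base", "10"))
    spec = {2: "b", 8: "o", 10: "d", 16: "x"}[base]
    return format(value, spec)
```

Parsing "integer { size = 32; align = 8; base = 16; }" with KNOWN_1_2 drops
"base", so the value 255 is shown as "255"; with KNOWN_1_3 it is shown as
"ff". Both readers accept the same text.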

The idea is to leave some free space in the specification, like this
"<other optional attributes here>", for future minor revisions of the
format. In binary formats, this is usually done with a fixed-offset field
providing the header size, like Bitmap's DIB Header Size field. As
long as the purpose of this fixed-offset field is known in the first minor
version of a given major version, new fields may be added at will in
later minor versions without breaking older decoders. For a text-based
format like TSDL, leaving some free space means relaxing the grammar so
that unknown blocks, fields or attributes are still parseable, but have no
attached semantics (yet).

tl;dr: Unless the current versions of Babeltrace/TraceCompass can read
the next version of CTF with no changes, it should be CTF 2.0.

Phil

> > 
> > Thanks!
> > 
> > Mathieu
> > 
> > --
> > Mathieu Desnoyers
> > EfficiOS Inc.
> > http://www.efficios.com
> > _______________________________________________
> > diamon-discuss mailing list
> > diamon-discuss@lists.linuxfoundation.org
> > https://lists.linuxfoundation.org/mailman/listinfo/diamon-discuss
> > 
> 
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
>