* [diamon-discuss] Common Trace Format 1.9 planning
From: Mathieu Desnoyers @ 2015-03-25 18:39 UTC
To: diamon-discuss

Hi,

As discussed on the workgroup call today, I am posting
the list of changes I have gathered for CTF 1.9. Feedback
is welcome!

* Add new set of features to Common Trace Format and to the Babeltrace
  reference implementation.


1. Handle transition from CTF 1.8 to 1.9, including compatibility and
   upgrade path for users.

We must take great care when extending the Common Trace Format
specification in a way that is not fully backward compatible, so that
users do not suffer from this transition. We also want to minimize the
frequency at which we make those non-compatible changes, by bundling them
all within one release, and by ensuring, for the future, that the format
can be extended with "features" that are simply ignored by a parser
implementation that does not know about them.

Since the description format contains version numbering, we can keep CTF
1.8 and CTF 1.9 parsers side-by-side in the trace reader, so users
gathering CTF 1.8 traces can use the new tools to view them. Only users
generating the new trace format (CTF 1.9) will need the new tools to view
them.


2. Add base address and symbol information support.

The Perf developers expressed strong interest in adding base address and
symbol information support within the Common Trace Format at the Trace
Summit 2014 [3]. This requirement seems to be interesting for most of the
community.

In the spirit of keeping the Common Trace Format dedicated as much as
possible to describing the layout of the binary trace, we intend to keep
the base address and symbol information separate from the CTF metadata
file. This can be achieved by creating a stream that contains the
following object load and unload events:

  [ object load, timestamp, address space, address, mapping length, object path ]
  [ object unload, timestamp, address space, address ]

This should allow describing load/unload events for the kernel, and for
each user-space process.

Initially, we can agree through discussion to introduce those events into
LTTng and Perf-to-CTF in the same way.


3. Add event versioning.

Experience with the Linux kernel Tracepoints shows us that relying on
event and field names is not sufficient for binding a trace analysis to
the semantics of the kernel or application. Sometimes, changes in the
software keep the same event and field names, but change their semantics.

A good example of such a change is wakeup delivery within the Linux
kernel, which has moved from the context of the thread performing the
wake-up to an inter-processor interrupt performing the wake-up on a
remote CPU. This is an issue for critical path analysis, which needs to
know about this semantic change to the PID field, although the event and
field names are unchanged.

Therefore, introduce an optional versioning property for the event, which
allows trace analyses and models to track which semantics they rely on.


4. Add field name prefix '$' to eliminate conflicts with reserved keywords.

Currently, the Common Trace Format metadata uses a '_' prefix to mitigate
conflicts with reserved keywords. However, it was a bad idea, because '_'
is a legitimate character within identifiers. This creates some confusion
in the cases where an event name is indeed prefixed by '_' already: the
underscore is either removed when it should not be removed, or '__' is
then necessary.

Fix this confusion by introducing the '$' optional prefix for identifiers.


5. Add attributes for type and field reference.

Turn the integer and floating point values description into attributes
attached to the type:

  integer ( attribute-list )
  floating_point ( attribute-list )

This will allow using the same concept of attributes on compound types,
e.g.:

  struct { ... } ( attribute-list )
  variant <tag> { ... } ( attribute-list )
  int32_t [10] ( attribute-list )

attribute-list is a comma-separated list of:

  identifier = expression

For instance, in the case of event pretty-printing, this can be used as:

  format = "content of format string"

Where the content of the format string follows what has been proposed in
the report on layout description.


6. Add pretty-printing "hints", specified outside of CTF, within either
   another specification, or an appendix. This will allow splitting data
   layout (CTF) from formatting.

The goal here is to allow the tracer implementation (e.g. LTTng-UST) to
specify additional attributes attached to a CTF type. The two attributes
we will implement in LTTng-UST and Babeltrace are:

* format = "{field1}: {field3:02x} ({event.context.allo})"

  The format string would follow the Python format string syntax [4]. This
  syntax is chosen over a printf-like one because it can refer to specific
  fields by name within the format string, so a plain string is enough as
  an attribute (rather than a function call, which would have a large
  impact on the CTF TSDL grammar). Python format strings are very similar
  to their printf counterparts: they share a very similar format
  specification.

  A few examples of printf vs Python format string syntax similarity:

  Event payload layout:

    struct {
        int myint;
        string mystring;
        double mydouble;
    };

  Printf: printf("abc %05d %s %g", myint, mystring, mydouble)
  Python: "abc {:05} {} {}"

  Printf: printf("%s: %x", mystring, myint)
  Python: "{mystring}: {myint:x}"


* printers = [ "hexdump", "..." ]

  The tracer and trace viewer implementations can agree on a "printers"
  attribute, which describes pretty-printing plugins that shall be used to
  render the type, if such plugins are available.

  This includes implementing, in the Babeltrace reference implementation,
  support for such pretty-printing plugins, which can register with
  Babeltrace and declare which type or types they expect as input.

  A reference plugin will be implemented, which prints a sequence of bytes
  as:

[...]
000015F0 5F 73 70 72 69 6E 74 66 5F 63 68 6B 00 5F 5F 78 _sprintf_chk.__x
00001600 73 74 61 74 00 6D 65 6D 6D 6F 76 65 00 5F 6F 62 stat.memmove._ob
00001610 73 74 61 63 6B 5F 62 65 67 69 6E 00 62 69 6E 64 stack_begin.bind
[...]

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

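The two pretty-printing hints proposed in item 6 can be tried out with a small Python sketch. This is only an illustration, not Babeltrace code: the helper names `render` and `hexdump`, the field values, and the `SimpleNamespace` stand-in for an event object are all assumptions made for the example. The only behavior relied on is Python's own `str.format`, which the proposal names as the syntax to follow.

```python
from types import SimpleNamespace


def render(format_hint: str, fields: dict) -> str:
    """Apply a 'format' hint to decoded field values using str.format."""
    return format_hint.format(**fields)


def hexdump(data: bytes, base: int = 0) -> str:
    """Render bytes as: 8-digit offset, 16 hex bytes, ASCII column.

    Non-printable bytes are shown as '.', as in the sample output of the
    proposal. `base` is the offset printed for the first byte.
    """
    lines = []
    for off in range(0, len(data), 16):
        chunk = data[off:off + 16]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        # 47 = 16 two-digit bytes plus 15 separators; pads a short last row.
        lines.append(f"{base + off:08X} {hex_part:<47} {text}")
    return "\n".join(lines)


# Made-up values for the example payload (myint, mystring, mydouble).
payload = {"myint": 42, "mystring": "hello", "mydouble": 1.5}
named = render("{mystring}: {myint:x}", payload)

# Dotted names such as event.context.allo resolve through attribute access.
event = SimpleNamespace(context=SimpleNamespace(allo=7))
nested = render(
    "{field1}: {field3:02x} ({event.context.allo})",
    {"field1": "open", "field3": 10, "event": event},
)

dump = hexdump(b"_sprintf_chk\x00__x", base=0x15F0)

print(named)
print(nested)
print(dump)
```

Because the hint is a plain string, a viewer that does not know about it can ignore it and fall back to its default rendering, which matches the intent of keeping formatting outside the core layout description.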
* Re: [diamon-discuss] Common Trace Format 1.9 planning
From: Matthew Khouzam @ 2015-03-25 19:09 UTC
To: diamon-discuss

This is great, so v1.9 is a superset of 1.8.

Would hints for time format be interesting or overspecialized?

Would hardware description in the metadata in a standardized form be a
good thing to have?

Would including a state system specification in the TSDL be an
interesting thing to do? Basically hints on how to process the events.

Thanks!

Matthew

On 15-03-25 02:39 PM, Mathieu Desnoyers wrote:
> 4. Add field name prefix '$' to eliminate conflicts with reserved keywords.
>
> Currently, the Common Trace Format metadata uses a '_' prefix to mitigate
> conflicts with reserved keywords. However, it was a bad idea, because '_'
> is a legitimate character within identifiers. This creates some confusion
> in the cases where an event name is indeed prefixed by '_' already: the
> underscore is either removed when it should not be removed, or '__' is
> then necessary.
>
> Fix this confusion by introducing the '$' optional prefix for identifiers.

$context actually looks better than _context

* Re: [diamon-discuss] Common Trace Format 1.9 planning
From: Mathieu Desnoyers @ 2015-03-25 19:22 UTC
To: Matthew Khouzam
Cc: diamon-discuss

----- Original Message -----
> This is great, so v1.9 is a superset of 1.8.

No. A superset would be a completely backward compatible 1.9.
This is not the case here. We plan to make incompatible changes
within 1.9, hence the need for both 1.8 and 1.9 parsers. Since the
version is self-described, it should not be an issue to detect
the CTF version from the trace metadata.

> Would hints for time format be interesting or overspecialized?

I would keep that for presentation, either an appendix to CTF or
a separate spec. After much discussion, we came to the conclusion
that keeping CTF as a binary trace layout description language,
and sticking to just that, seems to be a good way to keep the
CTF specification minimal.

This can be seen as splitting presentation from content in a web
page, where the presentation is in CSS, and the content in HTML.

We can then optionally add "attributes" associated with types within
the CTF metadata, but the semantics of those attributes would not be
defined within CTF. If a trace reader does not understand or care
about those attributes, they should simply be ignored.

> Would hardware description in the metadata in a standardized form be a
> good thing to have?

This is not needed to describe the binary trace layout, so this
should be covered by another spec, or by a CTF appendix.

> Would including a state system specification in the tsdl be an
> interesting thing to do? Basically hints on how to process the events.

We could have a separate spec for this. We could then map those
descriptions to the events. Again, this is not needed to decode
the binary trace.

The basic idea is that whatever is specified by CTF needs to be
fully implemented by _all_ CTF readers, and completely covered,
hence my intent to keep the optional stuff outside of this
core spec.

Thoughts?

Thanks,

Mathieu

> _______________________________________________
> diamon-discuss mailing list
> diamon-discuss@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/diamon-discuss

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

* Re: [diamon-discuss] Common Trace Format 1.9 planning
From: Simon Marchi @ 2015-03-25 19:40 UTC
To: Mathieu Desnoyers
Cc: diamon-discuss

On 15-03-25 03:22 PM, Mathieu Desnoyers wrote:
> > This is great, so v1.9 is a superset of 1.8.
>
> No. A superset would be a completely backward compatible 1.9.
> This is not the case here. We plan to do incompatible changes
> within 1.9, hence the 1.8 and 1.9 parsers needed. Since the
> version is self-described, it should not be an issue to detect
> the CTF version from the trace metadata.

From what I understand, it means that a reader for 1.9 won't be able to read
a 1.8 trace, is that right?

Do the version numbers mean something? If introducing non backward compatible
changes only bumps the minor version, what could ever bump the major?

If they don't have one already, this could be an opportunity to give a meaning
to the numbers. The major could be for breaking backward compatibility, while
the minor could be for backward-compatible changes. It would mean that a reader
for x.y should be able to read any x.z trace, where z <= y. In other words, the
x.y format would be a superset of x.z (I think?). Much like semver.org, but
without the PATCH level.

Simon

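The major/minor rule suggested in this message (a reader for x.y reads any x.z trace with z <= y) fits in a few lines. The sketch below is illustrative only; the `Version` type and `can_read` function are names invented for this example and are not part of any CTF tool.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """A major.minor format version, with no PATCH level."""
    major: int
    minor: int


def can_read(reader: Version, trace: Version) -> bool:
    """Return True if a reader for `reader` should decode a `trace`.

    Rule from the message: same major number, and the trace's minor
    number is not newer than the reader's.
    """
    return reader.major == trace.major and trace.minor <= reader.minor


print(can_read(Version(1, 9), Version(1, 8)))  # same major, older minor
print(can_read(Version(1, 8), Version(1, 9)))  # trace is newer than reader
print(can_read(Version(2, 0), Version(1, 8)))  # different major
```

Under this rule, a change that breaks backward compatibility can only be expressed by bumping the major number, which is the point the question about 1.9 raises.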
* Re: [diamon-discuss] Common Trace Format 1.9 planning
From: Mathieu Desnoyers @ 2015-03-25 20:44 UTC
To: Simon Marchi
Cc: diamon-discuss

----- Original Message -----
> On 15-03-25 03:22 PM, Mathieu Desnoyers wrote:
> > > This is great, so v1.9 is a superset of 1.8.
> >
> > No. A superset would be a completely backward compatible 1.9.
> > This is not the case here. We plan to do incompatible changes
> > within 1.9, hence the 1.8 and 1.9 parsers needed. Since the
> > version is self-described, it should not be an issue to detect
> > the CTF version from the trace metadata.
>
> From what I understand, it means that a reader for 1.9 won't be able to read
> a 1.8 trace, is that right?

Yes, we plan to add incompatible grammar changes. This might have to be
versioned as 2.0 then. However, we can have side-by-side implementations
of CTF 1.8 and 2.0 readers within a trace reading lib, and therefore
distinguish between those, and use the proper CTF reader implementation.

> Do the version numbers mean something? If introducing non backward compatible
> changes only bumps the minor version, what could ever bump the major?

Good question! In this case, we might be talking about a CTF 2.0 then,
since we are planning non-compatible changes.

> If they don't have one already, this could be an opportunity to give a
> meaning
> to the numbers. The major could be for breaking backward compatibility, while
> the minor could be for backward-compatible changes. It would mean that a
> reader
> for x.y should be able to read any x.z trace, where z <= y. In other words,
> the
> x.y format would be a superset of x.z (I think?). Much like semver.org, but
> without the PATCH level.

Yes, I think it's a good approach.

We have been hesitating between moving from CTF 1.8 to either 1.9 or 2.0.
Here are the upsides/downsides of each approach:

* 1.8 to 1.9:
  + Compatibility: New CTF 1.9 readers would be able to read CTF 1.8 traces,
  - Incompatibility: CTF 1.8 readers would not be able to read CTF 1.9 traces,
  - Complexity: The CTF 1.9 specification would need to be an exact subset
    of 1.8, which means a more complex spec, grammar, and implementations,

* 1.8 to 2.0:
  + Compatibility: Since we're keeping the version headers, a trace reader
    can implement parsers for both CTF 1.8 and 2.0, and read both trace
    formats without requiring user interaction,
  - Incompatibility: CTF 1.8 readers would not be able to read CTF 2.0 traces,
  + Simplicity: New CTF 2.0 readers would be simpler, since they would not
    need to read CTF 1.8 traces,

Since the user-visible impacts of bumping from 1.8 to 1.9 or from 1.8 to 2.0
appear to be the same for existing implementations of the spec, I am really
tempted to bump to 2.0 at this stage. Already having the version number in
the header makes the transition so much easier to manage.

Thoughts?

Thanks!

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

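The "side-by-side readers" idea in this message amounts to reading the declared version first and then handing the metadata to the matching parser. The following sketch shows that dispatch under stated assumptions: the metadata is taken to declare `major` and `minor` fields in its `trace` block, as CTF 1.8 TSDL does, while how a 2.0 format would declare its version is not settled in this thread. The functions `parse_ctf_1_8` and `parse_ctf_2_0` are hypothetical placeholders, not real library calls.

```python
import re


def declared_version(metadata: str) -> tuple:
    """Extract (major, minor) from 'major = N;' and 'minor = N;' fields."""
    major = re.search(r"\bmajor\s*=\s*(\d+)\s*;", metadata)
    minor = re.search(r"\bminor\s*=\s*(\d+)\s*;", metadata)
    if major is None or minor is None:
        raise ValueError("no version declared in metadata")
    return int(major.group(1)), int(minor.group(1))


# Hypothetical stand-ins for two independent parser implementations.
def parse_ctf_1_8(metadata: str) -> str:
    return "parsed with the CTF 1.8 parser"


def parse_ctf_2_0(metadata: str) -> str:
    return "parsed with the CTF 2.0 parser"


# One parser per major version; the user never has to pick one by hand.
PARSERS = {1: parse_ctf_1_8, 2: parse_ctf_2_0}


def read_metadata(metadata: str) -> str:
    """Pick the parser that matches the version declared in the metadata."""
    major, minor = declared_version(metadata)
    parser = PARSERS.get(major)
    if parser is None:
        raise ValueError(f"unsupported CTF version {major}.{minor}")
    return parser(metadata)


sample = "trace { major = 1; minor = 8; byte_order = le; };"
print(read_metadata(sample))
```

Keeping the two parsers separate is what makes the "Simplicity" point possible: the 2.0 parser never needs to accept 1.8 grammar, because the dispatcher routes 1.8 metadata elsewhere.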
* Re: [diamon-discuss] Common Trace Format 1.9 planning
From: Mathieu Desnoyers @ 2015-03-25 21:00 UTC
To: Simon Marchi, Philippe Proulx
Cc: diamon-discuss

Philippe might want to chime in.

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

* Re: [diamon-discuss] Common Trace Format 1.9 planning 2015-03-25 21:00 ` Mathieu Desnoyers @ 2015-03-25 21:41 ` Philippe Proulx 2015-03-25 23:07 ` Philippe Proulx 0 siblings, 1 reply; 8+ messages in thread From: Philippe Proulx @ 2015-03-25 21:41 UTC (permalink / raw) To: Mathieu Desnoyers; +Cc: diamon-discuss ----- Original Message ----- > From: "Mathieu Desnoyers" <mathieu.desnoyers@efficios.com> > To: "Simon Marchi" <simon.marchi@ericsson.com>, "Philippe Proulx" <pproulx@efficios.com> > Cc: diamon-discuss@lists.linuxfoundation.org > Sent: Wednesday, 25 March, 2015 5:00:43 PM > Subject: Re: [diamon-discuss] Common Trace Format 1.9 planning > > Philippe might want to chime in. Yes, I might. Comments below. > > ----- Original Message ----- > > ----- Original Message ----- > > > On 15-03-25 03:22 PM, Mathieu Desnoyers wrote: > > > > > This is great, so v1.9 is a superset of 1.8. > > > > > > > > No. A superset would be a completely backward compatible 1.9. > > > > This is not the case here. We plan to do incompatible changes > > > > within 1.9, hence the 1.8 and 1.9 parsers needed. Since the > > > > version is self-described, it should not be an issue to detect > > > > the CTF version from the trace metadata. > > > > > > From what I understand, it means that a reader for 1.9 won't be able to > > > read > > > a 1.8 trace, is that right? > > > > Yes, we plan to add incompatible grammar changes. This might have to be > > versioned as 2.0 then. However, we can have side-by-side implementations > > of CTF 1.8 and 2.0 readers within a trace reading lib, and therefore > > distinguish between those, and use the proper CTF reader implementation. > > > > > > > > Do the version numbers mean something? If introducing non backward > > > compatible > > > changes only bumps the minor version, what could ever bump the major? > > > > Good question! In this case, we might be talking about a CTF 2.0 then, > > since we are planning non-compatible changes. 
> >
> > >
> > > If they don't have one already, this could be an opportunity to give a
> > > meaning to the numbers. The major could be for breaking backward
> > > compatibility, while the minor could be for backward-compatible changes.
> > > It would mean that a reader for x.y should be able to read any x.z
> > > trace, where z <= y. In other words, the x.y format would be a superset
> > > of x.z (I think?). Much like semver.org, but without the PATCH level.
> >
> > Yes, I think it's a good approach.
> >
> > We have been hesitating between moving from CTF 1.8 to either 1.9 or 2.0.
> > Here are the upsides/downsides of each approach:
> >
> > * 1.8 to 1.9:
> >   + Compatibility: New CTF 1.9 readers would be able to read CTF 1.8
> >     traces,
> >   - Incompatibility: CTF 1.8 readers would not be able to read CTF 1.9
> >     traces,
> >   - Complexity: The CTF 1.9 specification would need to be an exact subset

I guess you meant "superset" here.

> >     of 1.8, which means a more complex spec, grammar, and implementations,
> >
> > * 1.8 to 2.0:
> >   + Compatibility: Since we're keeping the version headers, a trace reader
> >     can implement parsers for both CTF 1.8 and 2.0, and read both trace
> >     formats without requiring user interaction,
> >   - Incompatibility: CTF 1.8 readers would not be able to read CTF 2.0
> >     traces,
> >   + Simplicity: New CTF 2.0 readers would be simpler, since they would not
> >     need to read CTF 1.8 traces,
> >
> > Since the user-visible impacts of bumping from 1.8 to 1.9 or from 1.8 to
> > 2.0 appear to be the same for existing implementations of the spec, I am
> > really tempted to bump to 2.0 at this stage. Already having the version
> > number in the header makes the transition so much easier to manage.
> >
> > Thoughts?

File formats have various ways of identifying their versions.

The "serial" approach is often found (single version).
One example is Microsoft's Bitmap format:
<http://upload.wikimedia.org/wikipedia/commons/c/c4/BMPfileFormat.png>.
Here, the DIB Header Size field indicates the size of the DIB Header, and
depending on this size, a Bitmap file reader knows what fields are available.
This header size may be considered as a serial version number. As the Bitmap
format evolved over the years, new fields were added, and thus the DIB Header
Size field value was increased each time. Old readers may still read newer
Bitmap files here: they just skip the unknown fields thanks to the header
size, and continue from there. The program will miss some information, but
should still be able to decode the image.

I think the major.minor approach should follow this backward-compatibility
approach using the minor version: a CTF reader implemented by reading the
CTF 1.8 specification, should be able to decode CTF 1.9 traces, CTF 1.10
traces, and so on, with no changes. It is expected that this same reader
would not be able to decode CTF 2.0 traces (major bump).

This means that, when a new minor version is released, features may only be
_added_, and added in a way that makes sure older readers supporting older
versions sharing the same major number can still decode the new format as
they previously did. Just as a well-designed API may add features when bumping
its minor version, but must ensure older applications will work with no
changes using this new version.

Here's an example: if hypothetical CTF 1.2 says:

    Integers are defined this way:

        integer {
            size = <size>;
            align = <alignment>;
            <other optional attributes here>
        }

    Other optional attributes may be ignored.

and CTF 1.3 says:

    Integers are defined this way:

        integer {
            size = <size>;
            align = <alignment>;
            base = <base>;
            <other optional attributes here>
        }

    Other optional attributes may be ignored.

Then, a CTF 1.2 reader would be able to read a CTF 1.3 trace, but it
would ignore the "base" attribute (which falls into an "other optional
attribute" as per CTF 1.2) and display all integers in base 10.

A CTF 1.3 reader would know this "base" field, and treat it specially,
displaying integers with the provided radix.

The idea is letting some free space in the specification, like this
"<other optional attributes here>", for future minor revisions of the
format. In binary formats, this is usually done with a fixed-offset field
providing the header size, like Bitmap's DIB Header Size field. As
long as the purpose of this fixed-offset field is known in the first minor
version of a given major version, future, custom fields may be added
at will without breaking older decoders. For a text-based format like
TSDL, letting some free space means relaxing the grammar so that
unknown blocks, fields or attributes are still parseable, but have no
attached semantics (yet).

tl;dr: Unless the current version of Babeltrace/TraceCompass will be
able to read the next version of CTF with no change, it should be
CTF 2.0.

Phil

> >
> > Thanks!
> >
> > Mathieu
> >
> > --
> > Mathieu Desnoyers
> > EfficiOS Inc.
> > http://www.efficios.com
> > _______________________________________________
> > diamon-discuss mailing list
> > diamon-discuss@lists.linuxfoundation.org
> > https://lists.linuxfoundation.org/mailman/listinfo/diamon-discuss
> >
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
>

^ permalink raw reply [flat|nested] 8+ messages in thread
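
The hypothetical CTF 1.2 / CTF 1.3 integer example in the message above can
be made concrete with a small tolerant parser. This is a rough Python sketch
under stated assumptions: parse_integer, display and KNOWN_BY_VERSION are
made-up names, the "name = value;" regex is far simpler than the real TSDL
grammar, and attribute values are assumed to be plain decimal integers.

```python
import re

# Attributes each hypothetical reader version attaches semantics to.
KNOWN_BY_VERSION = {
    (1, 2): {"size", "align"},
    (1, 3): {"size", "align", "base"},
}

# Matches "name = value;" pairs inside the braces (simplified grammar).
_ATTR_RE = re.compile(r"(\w+)\s*=\s*([^;]+?)\s*;")


def parse_integer(declaration, reader_version):
    """Split attributes into those the reader knows and those it skips."""
    known = KNOWN_BY_VERSION[reader_version]
    body = declaration[declaration.index("{") + 1:declaration.rindex("}")]
    attrs, ignored = {}, []
    for name, value in _ATTR_RE.findall(body):
        if name in known:
            attrs[name] = int(value)
        else:
            # Still parseable, but no attached semantics for this reader.
            ignored.append(name)
    for required in ("size", "align"):
        if required not in attrs:
            raise ValueError("missing required attribute: " + required)
    return attrs, ignored


def display(value, attrs):
    """Render a value with the radix the reader knows about (default 10)."""
    base = attrs.get("base", 10)
    if base == 16:
        return "0x%x" % value
    if base == 8:
        return "0o%o" % value
    return "%d" % value
```

Given "integer { size = 32; align = 8; base = 16; }", the 1.2 reader keeps
size and align, skips base, and prints in decimal, while the 1.3 reader
prints in hexadecimal. The important design point is that the unknown
attribute is skipped rather than rejected, which only works because the
older grammar already reserved room for it.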
* Re: [diamon-discuss] Common Trace Format 1.9 planning
  2015-03-25 21:41 ` Philippe Proulx
@ 2015-03-25 23:07 ` Philippe Proulx
  0 siblings, 0 replies; 8+ messages in thread
From: Philippe Proulx @ 2015-03-25 23:07 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: diamon-discuss

----- Original Message -----
> From: "Philippe Proulx" <pproulx@efficios.com>
> To: "Mathieu Desnoyers" <mathieu.desnoyers@efficios.com>
> Cc: "Simon Marchi" <simon.marchi@ericsson.com>, diamon-discuss@lists.linuxfoundation.org
> Sent: Wednesday, 25 March, 2015 5:41:47 PM
> Subject: Re: [diamon-discuss] Common Trace Format 1.9 planning
>
> [...]
>
> File formats have various ways of identifying their versions.
>
> The "serial" approach is often found (single version). One example is
> Microsoft's Bitmap format:
> <http://upload.wikimedia.org/wikipedia/commons/c/c4/BMPfileFormat.png>.
> Here, the DIB Header Size field indicates the size of the DIB Header, and
> depending on this size, a Bitmap file reader knows what fields are available.
> This header size may be considered as a serial version number. As the Bitmap
> format evolved over the years, new fields were added, and thus the DIB Header
> Size field value was increased each time. Old readers may still read newer
> Bitmap files here: they just skip the unknown fields thanks to the header
> size, and continue from there. The program will miss some information, but
> should still be able to decode the image.
>
> I think the major.minor approach should follow this backward-compatibility
> approach using the minor version: a CTF reader implemented by reading the
> CTF 1.8 specification, should be able to decode CTF 1.9 traces, CTF 1.10
> traces, and so on, with no changes.
> It is expected that this same reader
> would not be able to decode CTF 2.0 traces (major bump).

After some discussions with Mathieu, here's what we agreed on.

Let my definition of the minor version (older reader able to read newer
minor versions) become the _patch version_. This means: older readers are
able to read newer patch versions (same major and minor versions).

Let my definition of the major version (complete break, including changing
the TSDL grammar) be split in two:

  * Minor version: changing the TSDL grammar is allowed, as long as the
    new grammar is a superset of the previous one (the previous grammar
    is completely included within the new one). This obviously means that
    a CTF 2.4 reader (which doesn't know CTF 2.5) won't be able to open a
    CTF 2.5 trace, since the grammar could have changed. However, a
    CTF 2.5 reader is necessarily able to read CTF 2.4 traces since the
    grammar of CTF 2.5 is a superset of CTF 2.4's.
  * Major version: complete compatibility break, possibly including
    removing/altering parts of the grammar. A CTF 3.x reader is not
    necessarily able to read CTF 2.x traces, since the grammars of CTF 2.x
    and CTF 3.x could form disjoint sets.

This approach makes sure that the major number will not change often, since
adding features to the grammar, as long as it's a superset of the previous
one, only bumps the minor version.

Of course, our efforts for the initial 2.0.0 version will focus on making
sure the new grammar is relaxed enough to reserve room for future
improvements that do not need a change in the grammar (patch version bump).
Thus a CTF 2.0.0 reader will be able to parse CTF 2.0.1 traces, CTF 2.0.2
traces, and so on, with no changes. For example, custom type attributes
could be prefixed (namespaced), so that new specified attributes may be
added without changing the grammar, and without interfering with custom
(prefixed) attributes, e.g. (hypothetical syntax):

    integer <
        size: 2,
        align: 4,
        x-my-custom-attribute: 0x1000,
        x-my-other-custom-attr: "something",
        attribute: 42   /* <-- May be added in CTF 2.0.1; CTF 2.0.0
                           readers will ignore it, while CTF 2.0.1+
                           readers will treat it specially (cannot be
                           a custom attribute, because it's not
                           "x-attribute"). */
    >

So, again, since we plan major alterations of CTF 1.8's grammar, i.e.
CTF 1.9's grammar won't be a superset of the current one, CTF 2.0 is
considered the best next release version number for the moment.

Phil

>
> This means that, when a new minor version is released, features may only be
> _added_, and added in a way that makes sure older readers supporting older
> versions sharing the same major number can still decode the new format as
> they previously did. Just as a well-designed API may add features when bumping
> its minor version, but must ensure older applications will work with no
> changes using this new version.
>
> Here's an example: if hypothetical CTF 1.2 says:
>
>     Integers are defined this way:
>
>         integer {
>             size = <size>;
>             align = <alignment>;
>             <other optional attributes here>
>         }
>
>     Other optional attributes may be ignored.
>
> and CTF 1.3 says:
>
>     Integers are defined this way:
>
>         integer {
>             size = <size>;
>             align = <alignment>;
>             base = <base>;
>             <other optional attributes here>
>         }
>
>     Other optional attributes may be ignored.
>
> Then, a CTF 1.2 reader would be able to read a CTF 1.3 trace, but it
> would ignore the "base" attribute (which falls into an "other optional
> attribute" as per CTF 1.2) and display all integers in base 10.
>
> A CTF 1.3 reader would know this "base" field, and treat it specially,
> displaying integers with the provided radix.
>
> The idea is letting some free space in the specification, like this
> "<other optional attributes here>", for future minor revisions of the
> format. In binary formats, this is usually done with a fixed-offset field
> providing the header size, like Bitmap's DIB Header Size field. As
> long as the purpose of this fixed-offset field is known in the first minor
> version of a given major version, future, custom fields may be added
> at will without breaking older decoders. For a text-based format like
> TSDL, letting some free space means relaxing the grammar so that
> unknown blocks, fields or attributes are still parseable, but have no
> attached semantics (yet).
>
> tl;dr: Unless the current version of Babeltrace/TraceCompass will be
> able to read the next version of CTF with no change, it should be
> CTF 2.0.
>
> Phil
>
> > >
> > > Thanks!
> > >
> > > Mathieu
> > >
> > > --
> > > Mathieu Desnoyers
> > > EfficiOS Inc.
> > > http://www.efficios.com
> > > _______________________________________________
> > > diamon-discuss mailing list
> > > diamon-discuss@lists.linuxfoundation.org
> > > https://lists.linuxfoundation.org/mailman/listinfo/diamon-discuss
> > >
> >
> > --
> > Mathieu Desnoyers
> > EfficiOS Inc.
> > http://www.efficios.com
> >

^ permalink raw reply [flat|nested] 8+ messages in thread
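
The major.minor.patch rules agreed in the message above reduce to a short
compatibility check. The Python sketch below is only an illustration of
those rules as stated in the thread; parse_version and can_read are
hypothetical helper names, and a False result means "not guaranteed by the
versioning scheme", not that a particular tool (for example one bundling
several parsers side by side) could never open the trace.

```python
def parse_version(text):
    """Turn a 'major.minor.patch' string into a tuple of three integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def can_read(reader_version, trace_version):
    """Return True if the scheme guarantees the reader decodes the trace.

    - Major: complete break, so nothing is guaranteed across majors.
    - Minor: each grammar is a superset of the previous one, so a reader
      handles traces of the same major with an equal or older minor.
    - Patch: no grammar change, so the patch number never blocks reading,
      whether the trace is older or newer than the reader.
    """
    r_major, r_minor, _ = parse_version(reader_version)
    t_major, t_minor, _ = parse_version(trace_version)
    if r_major != t_major:
        return False
    return t_minor <= r_minor
```

For instance, can_read("2.0.0", "2.0.1") and can_read("2.5.0", "2.4.3")
hold, while can_read("2.4.0", "2.5.0") and can_read("3.0.0", "2.4.0") do
not. The patch component is deliberately ignored, which is exactly why the
2.0.0 grammar has to be relaxed enough up front.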
Thread overview: 8+ messages
[not found] <470558667.103090.1427308587048.JavaMail.zimbra@efficios.com>
2015-03-25 18:39 ` [diamon-discuss] Common Trace Format 1.9 planning Mathieu Desnoyers
2015-03-25 19:09 ` Matthew Khouzam
2015-03-25 19:22 ` Mathieu Desnoyers
2015-03-25 19:40 ` Simon Marchi
2015-03-25 20:44 ` Mathieu Desnoyers
2015-03-25 21:00 ` Mathieu Desnoyers
2015-03-25 21:41 ` Philippe Proulx
2015-03-25 23:07 ` Philippe Proulx