[diamon-discuss] Common Trace Format 1.9 planning

* [diamon-discuss] Common Trace Format 1.9 planning
@ 2015-03-25 18:39 ` Mathieu Desnoyers
  2015-03-25 19:09   ` Matthew Khouzam
  0 siblings, 1 reply; 8+ messages in thread
From: Mathieu Desnoyers @ 2015-03-25 18:39 UTC
  To: diamon-discuss

Hi,

As discussed on the workgroup call today, I am posting
the list of changes I have gathered for CTF 1.9. Feedback
is welcome!

* Add a new set of features to the Common Trace Format and to the Babeltrace reference implementation.

1. Handle transition from CTF 1.8 to 1.9, including compatibility and upgrade path for users.

We must take great care when extending the Common Trace Format specification in a way that is not fully backward compatible, so that users do not suffer from this transition. We also want to minimize how often we make such incompatible changes, by bundling them all within one release, and by ensuring that, in the future, the format can be extended with "features" that are simply ignored by a parser implementation that does not know about them.

Since the description format contains version numbering, we can keep CTF 1.8 and CTF 1.9 parsers side-by-side in the trace reader, so users gathering CTF 1.8 traces can use the new tools to view them. Only users generating traces in the new format (CTF 1.9) will need the new tools to view them.
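
To illustrate the side-by-side approach, here is a minimal sketch in Python. The parser functions, the PARSERS table and the (major, minor) version tuple are hypothetical names chosen for illustration, not Babeltrace APIs.

```python
# Hypothetical reader-side dispatch on the version found in the metadata.
# None of these names come from Babeltrace; they only illustrate the idea.

def parse_ctf_1_8(metadata_text):
    # Placeholder for the existing CTF 1.8 metadata parser.
    return {"version": (1, 8), "metadata": metadata_text}


def parse_ctf_1_9(metadata_text):
    # Placeholder for the new CTF 1.9 metadata parser.
    return {"version": (1, 9), "metadata": metadata_text}


# Both parsers live side by side in the same reader.
PARSERS = {(1, 8): parse_ctf_1_8, (1, 9): parse_ctf_1_9}


def open_trace(version, metadata_text):
    """Pick the parser matching the (major, minor) version of the trace."""
    try:
        parser = PARSERS[version]
    except KeyError:
        raise ValueError("unsupported CTF version %d.%d" % version)
    return parser(metadata_text)
```

With this layout, a new reader still opens 1.8 traces, and an unknown version fails with an explicit error rather than being misparsed.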

2. Add base address and symbol information support.

The Perf developers expressed strong interest in adding base address and symbol information support to the Common Trace Format at the Trace Summit 2014 [3]. This requirement seems to be of interest to most of the community.

In the spirit of keeping the Common Trace Format dedicated as much as possible to describe the layout of the binary trace, we intend to keep the base address and symbol information separate from the CTF metadata file. This can be achieved by creating a stream that contains the following object load and unload events:

[ object load, timestamp, address space, address, mapping length, object path ]
[ object unload, timestamp, address space, address ]

This should allow describing load/unload events for the kernel, and for each user-space process.

Initially, we can agree through discussion on introducing those events in the same way in both LTTng and Perf-to-CTF.
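
As a rough sketch of how a trace reader could consume such a stream, the Python below replays load and unload events and resolves an address to an object path and offset. The class and method names are hypothetical, and it assumes events are fed in timestamp order and that unload identifies a mapping by its base address.

```python
from collections import namedtuple

# One loaded object: base address, mapping length, object path.
Mapping = namedtuple("Mapping", "base length path")


class AddressSpaceState:
    """Tracks loaded objects per address space (hypothetical helper)."""

    def __init__(self):
        self._maps = {}  # address space -> {base address: Mapping}

    def on_load(self, space, base, length, path):
        # [ object load, timestamp, address space, address, length, path ]
        self._maps.setdefault(space, {})[base] = Mapping(base, length, path)

    def on_unload(self, space, base):
        # [ object unload, timestamp, address space, address ]
        self._maps.get(space, {}).pop(base, None)

    def resolve(self, space, address):
        """Return (object path, offset in object), or None if unmapped."""
        for mapping in self._maps.get(space, {}).values():
            if mapping.base <= address < mapping.base + mapping.length:
                return mapping.path, address - mapping.base
        return None
```

The offset returned by resolve() is what a viewer would then look up in the symbol table of the object file, which stays outside the trace.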

3. Add event versioning.

Experience with the Linux kernel Tracepoints shows us that relying on event and field names is not sufficient for binding a trace analysis to the semantics of the kernel or application. Sometimes, changes in the software keep the same event and field names, but change their semantics.

A good example of such a change is the wakeup delivery within the Linux kernel, which has moved from the context of the thread performing the wake-up to an inter-processor interrupt performing the wake-up on a remote CPU. This is an issue for critical path analysis, which needs to know about this semantic change to the PID field, even though the event and field names are unchanged.

Therefore, introduce an optional versioning property for the event, which allows trace analyses and models to track which semantics they are dealing with.
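
A minimal sketch of how an analysis could bind to event semantics through such a version property follows. The event name, field names, version numbers and the default of 0 for a missing version are all assumptions made for illustration; the proposal above does not define them.

```python
# Hypothetical registry mapping (event name, version) to an analysis model.
HANDLERS = {}


def handles(name, version):
    """Register a handler for one specific version of an event."""
    def register(func):
        HANDLERS[(name, version)] = func
        return func
    return register


@handles("sched_wakeup", 0)
def wakeup_in_waker_context(event):
    # Assumed older semantics: emitted in the context of the waking thread,
    # so the current PID identifies the waker.
    return {"waker": event["current_pid"], "wakee": event["pid"]}


@handles("sched_wakeup", 1)
def wakeup_from_ipi(event):
    # Assumed newer semantics: emitted from an IPI on a remote CPU, so the
    # current context no longer identifies the waker.
    return {"waker": None, "wakee": event["pid"]}


def analyze(event):
    # The version property is optional; treating "absent" as 0 is an assumption.
    key = (event["name"], event.get("version", 0))
    handler = HANDLERS.get(key)
    if handler is None:
        raise LookupError("no model for %s version %d" % key)
    return handler(event)
```

An analysis that meets a version it has no model for can then refuse to run instead of silently producing a wrong critical path.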

4. Add field name prefix '$' to eliminate conflicts with reserved keywords.

Currently, the Common Trace Format metadata uses a '_' prefix to mitigate conflicts with reserved keywords. However, this was a bad idea, because '_' is a legitimate character in identifiers. This creates confusion in cases where an event name is already prefixed by '_': either the underscore is removed when it should not be, or '__' becomes necessary.

Fix this confusion by introducing the '$' optional prefix for identifiers.
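
The difference between the two schemes can be sketched as follows. This assumes that '$' never appears in a plain identifier and that a reader always strips a leading '$'; the proposal does not spell out either rule, and the keyword set below is only an illustrative subset.

```python
# Illustrative subset of TSDL reserved keywords.
RESERVED = {"event", "stream", "struct", "integer", "string"}


def escape(name):
    """Writer side: prefix with '$' only when the name is a keyword."""
    return "$" + name if name in RESERVED else name


def unescape(identifier):
    """Reader side: a leading '$' is never part of the original name."""
    return identifier[1:] if identifier.startswith("$") else identifier


def unescape_underscore(identifier):
    """The current '_' scheme, for comparison: stripping is ambiguous."""
    return identifier[1:] if identifier.startswith("_") else identifier
```

With '$', a name such as "_foo" round-trips unchanged, whereas stripping a leading '_' would turn it into "foo".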

5. Add attributes for type and field reference.

Turn the descriptions of integer and floating point types into attributes attached to the type:

integer ( attribute-list )
floating_point ( attribute-list )

This will allow using the same concept of attributes on compound types, e.g.:

struct { ... } ( attribute-list )
variant <tag> { ... } ( attribute-list )
int32_t [10] ( attribute-list )

attribute-list is a comma-separated list of:

  identifier = expression

For instance, in the case of event pretty-printing, this can be used as:

  format = "content of format string"

Where the content of the format string follows what has been proposed in the report on layout description.
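
To make the attribute-list shape concrete, here is a small Python sketch that parses a comma-separated list of "identifier = expression" pairs. The expression grammar is not defined above, so this sketch assumes an expression is either a double-quoted string or a bare token, with bare tokens converted to integers when possible. The function name is hypothetical.

```python
import re

# identifier = ( "quoted string" | bare token ), followed by ',' or the end.
_ATTRIBUTE = re.compile(
    r'\s*([A-Za-z_]\w*)\s*=\s*("(?:[^"\\]|\\.)*"|[^,]+?)\s*(?:,|$)'
)


def parse_attribute_list(text):
    """Parse 'a = 1, b = "x, y"' into a dict (simplified sketch)."""
    text = text.strip()
    attributes = {}
    pos = 0
    while pos < len(text):
        match = _ATTRIBUTE.match(text, pos)
        if match is None:
            raise ValueError("malformed attribute list at offset %d" % pos)
        name, raw = match.group(1), match.group(2)
        if len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"':
            value = raw[1:-1]  # quoted string: drop the quotes
        else:
            try:
                value = int(raw, 0)  # accepts 32, 0x20, ...
            except ValueError:
                value = raw  # keep any other bare token as text
        attributes[name] = value
        pos = match.end()
    return attributes
```

Note that commas inside a quoted string do not split the list, which matters for format strings.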

6. Add pretty-printing "hints", specified outside of CTF, within either another specification, or an appendix. This will allow splitting data layout (CTF) from formatting.

The goal here is to allow the tracer implementation (e.g. LTTng-UST) to specify additional attributes attached to a CTF type. The two attributes we will implement in LTTng-UST and Babeltrace are:

  * format = "{field1}: {field3:02x} ({event.context.allo})"

The format string would follow the Python format string syntax [4]. This syntax is preferred over a printf-like one because it can refer to specific fields by name within the format string, so a plain string suffices as the attribute (rather than a function call, which would have a large impact on the CTF TSDL grammar). Python format strings are very similar to their printf counterparts: they share a very similar format specification.

A few examples of printf vs Python format string syntax similarity:

Event payload layout:

struct {
        int myint;
        string mystring;
        double mydouble;
};

Printf: printf("abc %05d %s %g", myint, mystring, mydouble)
Python: "abc {:05} {} {}"

Printf: printf("%s: %x", mystring, myint)
Python: "{mystring}: {myint:x}"
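
The two pairs above can be checked directly in Python. The sample field values below are made up for the demonstration.

```python
# Sample payload matching the struct above (values are arbitrary).
event = {"myint": 42, "mystring": "hello", "mydouble": 2.5}

# First pair: positional fields.
printf_positional = "abc %05d %s %g" % (
    event["myint"], event["mystring"], event["mydouble"])
python_positional = "abc {:05} {} {}".format(
    event["myint"], event["mystring"], event["mydouble"])

# Second pair: fields referenced by name, which printf cannot express.
printf_named = "%s: %x" % (event["mystring"], event["myint"])
python_named = "{mystring}: {myint:x}".format(**event)
```

For these values both styles give "abc 00042 hello 2.5" and "hello: 2a". One caveat: "{}" and "%g" agree for 2.5 but not for every double (0.1 + 0.2 prints as 0.30000000000000004 with "{}" and as 0.3 with "%g"), so "{:g}" is the closer equivalent when exact parity matters.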

  * printers = [ "hexdump", "..." ]

The tracer and trace viewer implementations can agree on a "printers" attribute, which describes pretty-printing plugins that shall be used to render the type, if such plugins are available.

This includes implementing support for such pretty-printing plugins in
the Babeltrace reference implementation. Plugins can register with
Babeltrace, and express which type or types they expect as input.
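
A possible shape for such a registry, sketched in Python. The function names, the type names and the fallback behaviour are assumptions for illustration only; this is not the Babeltrace plugin API.

```python
# Hypothetical registry: printer name -> (accepted type names, function).
PRINTERS = {}


def register_printer(name, accepted_types):
    """Register a pretty-printer and the types it expects as input."""
    def decorator(func):
        PRINTERS[name] = (frozenset(accepted_types), func)
        return func
    return decorator


def render(value, type_name, printers_attr):
    """Use the first listed printer that is available and accepts the type."""
    for name in printers_attr:
        entry = PRINTERS.get(name)
        if entry is not None and type_name in entry[0]:
            return entry[1](value)
    return repr(value)  # no suitable plugin: fall back to a default rendering


@register_printer("hex", ["integer"])
def print_hex(value):
    return "0x%x" % value
```

Because unknown or unsuitable printers are skipped, a viewer without a given plugin still renders the field.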

A reference plugin will be implemented, which prints a sequence of
bytes as:

[...]
000015F0 5F 73 70 72 69 6E 74 66 5F 63 68 6B 00 5F 5F 78 _sprintf_chk.__x
00001600 73 74 61 74 00 6D 65 6D 6D 6F 76 65 00 5F 6F 62 stat.memmove._ob
00001610 73 74 61 63 6B 5F 62 65 67 69 6E 00 62 69 6E 64 stack_begin.bind
[...]
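
A minimal Python sketch producing the layout shown above: an 8-digit uppercase hexadecimal offset, 16 bytes per line, then the printable ASCII rendering with '.' for other bytes. The function name and parameters are hypothetical, and padding of a short final line is left out since the sample does not show it.

```python
def hexdump(data, start_offset=0, width=16):
    """Format bytes as 'OFFSET HH HH ... ascii', one line per `width` bytes."""
    lines = []
    for i in range(0, len(data), width):
        chunk = data[i:i + width]
        hex_part = " ".join("%02X" % byte for byte in chunk)
        # Printable ASCII is shown as-is, everything else as '.'.
        text = "".join(chr(byte) if 32 <= byte < 127 else "." for byte in chunk)
        lines.append("%08X %s %s" % (start_offset + i, hex_part, text))
    return "\n".join(lines)
```

For example, hexdump(b"_sprintf_chk\x00__x", 0x15F0) reproduces the first sample line above.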

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
