From: Geneviève Bastien
Date: Fri, 23 Jan 2015 11:56:04 -0500
Subject: Re: [diamon-discuss] [lttng-dev] My experience on perf, CTF and TraceCompass, and some suggection.
To: Wang Nan
Cc: "diamon-discuss@lists.linuxfoundation.org", lttng-dev@lists.lttng.org, Naser Ezzati, tracecompass developer discussions
Message-ID: <54C27D24.8030100@versatic.net>
In-Reply-To: <54C2770F.4080305@voxpopuli.im>
References: <54C215D6.1030804@huawei.com> <54C2770F.4080305@voxpopuli.im>
List-Id: DiaMon diagnostic and monitoring workgroup general discussions

Hi Wang,

Thanks for sharing your experience. It's always useful to have real-life
use cases for the tools.

Alex already gave quite a complete answer.

I'll just add some information about your wish for ad-hoc visualization
and statistics. As Alex said, data-driven analysis using XML files is
already present in Trace Compass. Documentation on how to use it is
available here:
https://wiki.eclipse.org/Linux_Tools_Project/LTTng2/User_Guide#Data_driven_analysis

You can build your own analysis with any event type. The current support
is rather basic: you need to write the XML yourself, starting from a
template, and it supports only XY charts and time graph views. Current
work by students at Polytechnique on the data-driven analysis involves
defining data-driven custom filters for events and views, developing a
visual UI to build an analysis from a state diagram, and supporting more
use cases of analysis from the event data. I'm cc'ing Naser Ezzati, who
is currently working on the XML analysis.
He's been working, among other things, on custom statistics. I don't
know the status of this development, but he may be able to point you to
his development branch, if it's ready, so you can see whether it fits
your current needs.

If you want more details on the data-driven analysis work being done at
Poly right now, you can look at the presentations by Naser Ezzati,
Jean-Christian Kouamé and Simon Delisle on this page:
https://ahls.dorsal.polymtl.ca/dec2014. If you're interested in trying
out their [still experimental] work, let us know and we'll see if there
is an experimental working branch you could try.

Cheers,
Geneviève

On 01/23/2015 11:30 AM, Alexandre Montplaisir wrote:
> Hi Wang,
>
> First of all, thank you very much for posting this use case. This is
> exactly the type of user feedback that will help make the toolchain
> better and more useful for users!
>
> Some comments and questions below,
>
>
> On 01/23/2015 04:35 AM, Wang Nan wrote:
>> [...]
>>
>> Then I need to convert perf.data to CTF. It took 140.57s to convert
>> 2598513 samples, which were collected during only 1 second of
>> execution. My working server has 64 2.0GHz Intel Xeon cores, but perf
>> conversion utilizes only 1 of them. I think this is another thing
>> that can be improved.
>
> Out of curiosity, approximately how big (in bytes) is the generated
> CTF trace directory?
>
>>
>> The next step is visualization. The output CTF trace can be opened
>> with TraceCompass without problem. The most important views for me
>> are the resources view (I use it to check CPU usage) and the control
>> flow view (I use it to check thread activities).
>>
>> The first uncomfortable thing is TraceCompass' slow response time. For
>> the trace I mentioned above, in the resources view, after I click on a
>> CPU idle area, I have to wait more than 10 seconds for the event list
>> to update before I can get the previous event before the idle area.
>
> Interesting.
> It is expected that opening a very large trace would take a long time
> to load the first time, as everything gets indexed. But once that step
> is done, seeking within the trace should be relatively quick
> (O(log n) with respect to the trace size). In theory ;)
>
> The perf-to-CTF conversion brings a completely new type of CTF trace
> that was not seen before. It is possible that the CTF parser in Trace
> Compass has some inefficiencies that were not exposed by other trace
> types. Are you able to share that trace publicly? Or a trace taken in
> the same environment, with no sensitive information in it? It could be
> very helpful in finding such problems.
>
>> Then I found through the resources view that perf itself took lots of
>> CPU time. In my case 33.5% of samples are generated by perf itself.
>> One core is dedicated to perf and is never idle or taken by others. I
>> think this is another thing that needs to be improved: perf should
>> give a way to blacklist itself when tracing all CPUs.
>
> I don't want to start a tracer-war here :) but have you investigated
> using LTTng for recording syscall/sched events? Compared to perf,
> LTTng is only about "getting trace events", and is a bit more involved
> to set up, but it is more focused on performance and minimizing the
> impact on the traced applications. And it outputs in CTF format too.
>
> I remember when testing the perf-CTF patches, comparing a perf trace
> to an LTTng one, perf would be doing system calls continuously on one
> of the CPUs for the whole duration of the trace. Whereas in LTTng
> traces, the session daemon would be a bit active at the beginning and
> at the end, but otherwise completely invisible from the trace.
>
>> TraceCompass doesn't recognize syscall:* tracepoints as CPU status
>> changing points. I also have to catch raw_syscall:*, which doubles
>> the number of samples.
>
> This is a gap in the definition of the analysis it seems.
> I don't remember implementing two types of "syscall" events in the
> perf analysis, so it should just be a matter of getting the exact
> event name and adding it to the list. I will take a look and keep you
> posted!
>
>> Finally I found the syscall which causes the idle time. However, I
>> needed to write a script to compute statistics. TraceCompass itself
>> lacks a means to count different events in my own way.
>
> Could you elaborate on this please? I agree the "Statistics" view in
> TC is severely lacking, we could be gathering and displaying much more
> information. The only question is what information would actually be
> useful.
>
> What exactly would you have liked to be able to see in the tool?
>
>> [...]
>>
>>
>> 5. Ad-hoc visualization and statistics. Currently TraceCompass only
>>    supports drawing pre-defined events and processes. When I try to
>>    capture syscalls:*, I get no benefit from TraceCompass because it
>>    doesn't know them. I believe that during system tuning we will
>>    eventually reach cases that the TraceCompass designers could not
>>    pre-define. Therefore, giving users the ability to define their
>>    own events and models would be very helpful.
>
> As I mentioned earlier, the pre-defined "perf analysis" in Trace
> Compass should be fixed to handle the syscall events.
>
>
> But it's interesting that you mention wanting to add your own events
> and model. I completely agree with you, we will never be able to
> predict each and every use case the users will want to use the tool
> for, so there should be a way for the user to add their own.
>
> Well good news, it *is* possible for the user to define their own
> analysis and views! This is still undergoing a lot of development, and
> there is no nice UI yet, which is why it is not really advertised. But
> starting from any supported trace type, a user today can define a time
> graph view (like the Resource View for example) or an XY chart, using
> a data-driven XML syntax.
>
> If you are curious, you can take a look at a full example of doing
> such a thing on this page:
> https://github.com/alexmonthy/ust-tc-example
> (the example uses an LTTng UST trace as a source, but it could work
> with any supported trace type, even a custom text trace defined in the
> UI).
>
>>
>> Thank you.
>
> Thanks again for taking the time to write about your experience!
>
> Cheers,
> Alexandre
>
>
> _______________________________________________
> lttng-dev mailing list
> lttng-dev@lists.lttng.org
> http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
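
For the event counting discussed above (Wang mentions having to write a
script to get his statistics), here is a minimal sketch in Python. It
assumes the CTF trace has first been dumped to text, for example with
`babeltrace <trace-dir>`, and that each line has the shape shown in the
comment. That line shape, the regular expression, and the names
`EVENT_RE` and `count_events` are illustrative assumptions, not
something taken from the thread or from Trace Compass.

```python
import re
import sys
from collections import Counter

# ASSUMPTION: one event per line, in a shape similar to babeltrace's
# default text output, e.g.
#   [11:30:01.000000002] (+0.000000001) host raw_syscalls:sys_exit: { cpu_id = 0 }, { ret = 0 }
# The delta and host name are treated as optional. Adjust the pattern
# if your dump looks different.
EVENT_RE = re.compile(
    r"^\[[^\]]+\]\s+"            # timestamp in square brackets
    r"(?:\(\+[^)]*\)\s+)?"       # optional delta since previous event
    r"(?:\S+\s+)?"               # optional host name
    r"(?P<name>[\w:.\-]+):\s+"   # event name, followed by a colon
    r"\{\s*cpu_id = (?P<cpu>\d+)"
)


def count_events(lines):
    """Count events per name and per (cpu, name) in an iterable of text lines."""
    by_name = Counter()
    by_cpu = Counter()
    for line in lines:
        match = EVENT_RE.match(line)
        if match is None:
            continue  # skip anything that does not look like an event line
        name = match.group("name")
        by_name[name] += 1
        by_cpu[(int(match.group("cpu")), name)] += 1
    return by_name, by_cpu


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file names): python3 count_events.py trace.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as dump:
        totals, _ = count_events(dump)
    for event_name, count in totals.most_common():
        print(f"{count:10d}  {event_name}")
```

Run on a text dump, this prints one line per event type, most frequent
first. The per-CPU counter is returned as well, so the same pass can
answer which events dominate on a given core.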