Date: Thu, 5 Oct 2006 13:01:32 -0400
From: Mathieu Desnoyers
To: Steven Rostedt
Cc: LKML, Ingo Molnar, Thomas Gleixner, Karim Yaghmour, Andrew Morton, Chris Wright, fche@redhat.com, Tom Zanussi
Subject: Re: [RFC] The New and Improved Logdev (now with kprobes!)

* Steven Rostedt (rostedt@goodmis.org) wrote:
> > It would be great to have this logging information recorded in a
> > standardized buffer format so it could be analyzed together with data
> > gathered by other instrumentation. Instead of using Tom's relay
> > mechanism directly, you might want to have a look at LTTng
> > (http://ltt.polymtl.ca): it would be a simple matter of describing your
> > own facility (group of events) and the data types they record, running
> > genevent (the serialization code generator), and calling those
> > serialization functions when you want to record to the buffers from
> > logdev.
>
> Hmm, interesting.
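To make the facility idea concrete: the generated serialization code essentially packs each event's fields into the trace buffer in a fixed layout. Here is a hand-written stand-in for what such generated code might look like; the struct, field names, and function name are illustrative, not LTTng's actual generated API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical event record for a "logdev" facility.  In LTTng the
 * equivalent code would be generated by genevent from the facility
 * description rather than written by hand. */
struct logdev_event {
	uint16_t event_id;   /* which event within the facility */
	uint64_t tsc;        /* timestamp counter at record time */
	int32_t  value;      /* example payload field */
};

/* Serialize one event into buf; returns bytes written, or 0 if the
 * buffer is too small.  Copying field by field keeps the on-disk
 * trace format independent of in-memory struct padding. */
static size_t serialize_event(const struct logdev_event *ev,
			      uint8_t *buf, size_t len)
{
	const size_t need = sizeof ev->event_id + sizeof ev->tsc +
			    sizeof ev->value;
	if (len < need)
		return 0;
	memcpy(buf,      &ev->event_id, sizeof ev->event_id);
	memcpy(buf + 2,  &ev->tsc,      sizeof ev->tsc);
	memcpy(buf + 10, &ev->value,    sizeof ev->value);
	return need;
}
```

A caller (here, logdev's record path) would reserve space in the buffer and then invoke the per-event serialization function.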
> But in the meantime, what you describe seems a little outside the scope
> of logdev. That doesn't mean it can't be applied, now or later. But
> currently, I use logdev for 90% debugging and 10% analyzing. Perhaps for
> the analyzing part, this would be useful. I have to admit, I didn't get
> far trying to convert LTTng to 2.6.18; I didn't have the time. Ah, I see
> you have a patch there now for 2.6.18. Adding this would be good to do.
> But unfortunately, my time is currently very limited (whose isn't? But
> mine currently is more limited than it usually is).

The uses of LTTng that I am aware of are not limited to analysis: some
users, Autodesk for instance, use its user-space tracing capabilities
extensively to find deadlocks and deadline misses in their video
applications. What I have found is that having a general overview of the
system in the same trace as the debugging information is a very powerful
aid to developers.

> When things slow down for me a little, I'll see where you are at, and
> take a look. Something we can also discuss at the next OLS.

Sure, I'll be glad to discuss it.

> To logdev, speed of the trace is important, but not that important.
> Accuracy of the trace is the most important. Originally, I had a single
> buffer and would use spinlocks to protect it. All CPUs would share this
> buffer. The reason for this is that I wanted simple code to prove that
> the sequence of events really did happen in a certain order. I just
> recently changed the ring buffer to use a lockless buffer per CPU, but I
> still question its accuracy. But I guess it does make things faster now.

That's why I directly use the timestamp counter (when synchronized) of
the CPUs; I do not rely on the kernel time base when it is not needed. As
I use the timestamps to merge the events from the multiple buffers, they
must be as accurate as possible.
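Merging events from multiple per-CPU buffers by timestamp, as described above, amounts to a k-way merge keyed on the timestamp counter. A minimal user-space sketch (the event layout here is made up for illustration, not LTTng's actual buffer format):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct event { uint64_t tsc; int payload; };

/* Merge per-CPU event arrays (each already in timestamp order, as a
 * per-CPU ring buffer naturally is) into one globally ordered stream:
 * repeatedly pick the CPU whose next pending event has the lowest
 * timestamp.  Supports up to 8 CPUs in this toy version. */
static size_t merge_by_tsc(const struct event *cpu[], const size_t n[],
			   size_t ncpus, struct event *out)
{
	size_t pos[8] = { 0 };   /* read cursor per CPU */
	size_t written = 0;

	for (;;) {
		size_t best = ncpus;  /* sentinel: no CPU has events left */
		for (size_t i = 0; i < ncpus; i++) {
			if (pos[i] < n[i] &&
			    (best == ncpus ||
			     cpu[i][pos[i]].tsc < cpu[best][pos[best]].tsc))
				best = i;
		}
		if (best == ncpus)
			return written;
		out[written++] = cpu[best][pos[best]++];
	}
}
```

This is why timestamp accuracy matters so much here: if the per-CPU counters drift, the merged order no longer reflects the real order of events across CPUs.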
> > I think it would be great to integrate those infrastructures together
> > so we can easily merge information coming from various sources
> > (markers, logdev, SystemTap scripts, LKET).
>
> The one argument I have against this is that some of these have
> different objectives. Merging too much can dilute the objective of the
> app. But I do think that cooperation between the tools would be nice.

Yes, I don't think it should become "one" big project, just that each
project should be able to interface with the others.

> I know I said I'm staying out of the debate, but I need to ask this
> anyway. Couldn't LTTng be fully implemented with dynamic traces? And if
> so, what would be the case for getting that into the kernel, and then
> maintaining a separate patch to convert those dynamic traces into static
> ones where performance is critical? This way, you could get the
> infrastructure into the kernel and get more eyes on it. It would also
> make the patch smaller.

In its current state, LTTng is already split into such pieces. The most
highly reusable parts are:

- Code markup mechanism (markers)
- Serialization mechanism (facilities) within probes (ltt-probes kernel
  modules) dynamically connected to markers
- Tracing control mechanism (ltt-tracer, ltt-control)
- Buffer management mechanism (ltt-relay)

To answer your question, I will distinguish the elements of this
"dynamic" term that is so widely used:

* Dynamic probe connection

LTTng 0.6.0 now supports dynamic probe connection on the markers. A
probe is a dynamically loadable kernel module, and load/unload of these
modules is supported.

* Dynamic registration of new events/event record types

LTTng has supported such dynamic registration since the 0.5.x series.
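The marker/probe split described above can be modeled in a few lines: a marker is a named hook point compiled into the code, and a probe is a function that a loadable module connects at run time. This is a toy user-space model of the concept, not the real LTTng or kernel marker API, and the names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* A probe is a callback a module can connect to a marker at run time. */
typedef void (*probe_fn)(const char *name, int arg);

struct marker {
	const char *name;
	probe_fn probe;   /* NULL while no probe is connected */
};

static int hits;   /* test counter for the sample probe below */

static void count_probe(const char *name, int arg)
{
	(void)name;
	hits += arg;
}

/* Connect/disconnect is just swapping the function pointer.  The real
 * kernel implementation must do this with proper synchronization
 * (e.g. RCU) so in-flight callers stay safe across module unload. */
static void marker_connect(struct marker *m, probe_fn p)    { m->probe = p; }
static void marker_disconnect(struct marker *m)             { m->probe = NULL; }

/* The instrumented code path: near-zero cost while disconnected. */
static void marker_hit(struct marker *m, int arg)
{
	if (m->probe)
		m->probe(m->name, arg);
}
```

The point of the design is visible even in the toy: the instrumentation site stays in the source (so it tracks code changes), while the tracing behavior is decided entirely at run time by which probe module is loaded.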
* Probe placement

What makes debug-information-based probe placement unsuitable as the
only option for LTTng:

- inability to extract all the local variables
- performance impact
- inability to follow kernel code changes as well as a marker inserted
  in the code itself

Regards,

Mathieu

OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68