From: Frederic Weisbecker
Date: Wed, 12 May 2010 18:46:57 +0200
To: Steven Rostedt
Cc: Pierre Tardy, Ingo Molnar, Arnaldo Carvalho de Melo, Peter Zijlstra,
    Tom Zanussi, Paul Mackerras, linux-kernel@vger.kernel.org,
    mathieu.desnoyers@efficios.com, arjan@infradead.org,
    ziga.mahkovec@gmail.com
Subject: Perf and ftrace [was Re: PyTimechart]
Message-ID: <20100512164650.GH5405@nowhere>
References: <20100511213625.GD5422@nowhere>
    <20100512144811.GA5405@nowhere>
    <1273678596.27703.30.camel@gandalf.stny.rr.com>
In-Reply-To: <1273678596.27703.30.camel@gandalf.stny.rr.com>

On Wed, May 12, 2010 at 11:36:36AM -0400, Steven Rostedt wrote:
> On Wed, 2010-05-12 at 16:48 +0200, Frederic Weisbecker wrote:
> > On Wed, May 12, 2010 at 03:37:27PM +0200, Pierre Tardy wrote:
> >
> > But we don't yet support trace_printk in perf. May be we could wrap
> > them in trace events.
>
> Hmm, do we really want to do that?
>
> We really need to get the perf and ftrace trace buffers combined.
> I understand why perf chose to do the mmap buffers for the counting

I don't think that's the reason. I mean, that's the reason for the perf
tools that record and analyse events live as they come (perf top, perf
stat). But there is no strong reason for perf record not to use splice,
apart from the fact that perf doesn't support splice.

> but
> for live streaming, it is very inefficient compared to splice.

Yeah, totally agreed. I'm looking forward to the day we have a ring
buffer that can be either lockless per-cpu or support contention, that
can be spliced, mmap'ed and read, and that supports overwrite mode, so
that we can unify all this mess between perf and ftrace.

But note that splice is only part of the problem, and possibly not the
biggest one for now (though it is an important one): perf is starting to
show its weaknesses now that we are playing with lock events (by nature
high-frequency events). This is mostly because we do a round-robin pass
over all the per-cpu mmap'ed buffers. By the time you have handled one
event buffer, you have already lost a lot of events from another one.

trace-cmd is certainly much more efficient in this regard (one thread
per cpu splicing one file per cpu), although less convenient for cross
analysis, as you need to handle several files.

perf record works well with every kind of event except lock events. I
plan to try something like a perf multiplex: one thread per cpu that
reads its mmap'ed buffer and writes to its own file, and in the end you
gather the whole into a single one. This will solve the first problem:
this scheme will probably reach 80% of trace-cmd's efficiency, until we
get true splice support.

In fact, I hope trace-cmd will get merged into tools/. I'm not worried
anymore about having two different tools that do the same things wrt
tracing, because I think they will eventually be merged together step by
step: the format parsing API, kernelshark, the
sched/lock/timechart/kmem/etc. tools.
And sharing the same buffer will probably mark the final merge of the
two, with a single, strong tracing tool set.

> I would hate to add more duplicate code to have perf support
> trace_printk().

No, having trace_printk() implemented on top of trace events is a win on
both sides: we can toggle their activation, filter them, have their
format, etc. The duplication would only reside in the tracing callback,
and only as a temporary thing like the others, until we finally have
this common buffer.

Thanks.
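
P.S. To make the "perf multiplex" idea above more concrete, here is a
rough userspace sketch of the scheme: one reader thread per cpu drains
its own buffer into its own file, and the per-cpu files are gathered
into a single time-ordered stream at the end. This is only an
illustration, not perf code. The buffers are simulated with in-memory
deques, the names (drain_cpu_buffer, record_and_merge, the cpuN.events
files) are made up for the example, and it assumes that events within
one cpu's buffer are already in timestamp order.

```python
import heapq
import json
import os
import tempfile
import threading
from collections import deque


def drain_cpu_buffer(cpu, buffer, out_path):
    """Reader thread body: drain one cpu's buffer into that cpu's own file."""
    with open(out_path, "w") as out:
        while buffer:
            timestamp, name = buffer.popleft()
            out.write(json.dumps([timestamp, cpu, name]) + "\n")


def read_events(path):
    """Yield (timestamp, cpu, name) tuples from one per-cpu file."""
    with open(path) as f:
        for line in f:
            yield tuple(json.loads(line))


def record_and_merge(per_cpu_buffers, workdir):
    """Run one reader thread per cpu, then merge the per-cpu files."""
    paths, threads = [], []
    for cpu, buffer in enumerate(per_cpu_buffers):
        path = os.path.join(workdir, f"cpu{cpu}.events")
        paths.append(path)
        t = threading.Thread(target=drain_cpu_buffer, args=(cpu, buffer, path))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    # Assumption: each per-cpu file is already time-ordered, so a k-way
    # merge on the timestamp is enough to build the single final stream.
    return list(heapq.merge(*(read_events(p) for p in paths)))


def demo():
    # Two simulated per-cpu buffers holding (timestamp, event name) pairs.
    buffers = [
        deque([(1, "lock_acquire"), (4, "lock_release")]),
        deque([(2, "lock_acquire"), (3, "lock_contended")]),
    ]
    with tempfile.TemporaryDirectory() as workdir:
        return record_and_merge(buffers, workdir)
```

Calling demo() returns the events of both simulated cpus interleaved by
timestamp. The point of the design is that no reader ever waits on
another cpu's buffer; the cost is the extra gathering pass at the end.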