Date: Mon, 13 Jul 2015 23:09:15 +0900
From: Namhyung Kim
To: pi3orama
Cc: He Kuang, Alexei Starovoitov, "rostedt@goodmis.org",
    "masami.hiramatsu.pt@hitachi.com", "acme@kernel.org",
    "a.p.zijlstra@chello.nl", "mingo@redhat.com", "jolsa@kernel.org",
    "wangnan0@huawei.com", "linux-kernel@vger.kernel.org"
Subject: Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
Message-ID: <20150713140915.GD9917@danjae.kornet>
References: <1436522587-136825-1-git-send-email-hekuang@huawei.com>
    <1436522587-136825-4-git-send-email-hekuang@huawei.com>
    <55A042DC.6030809@plumgrid.com> <55A3404B.6020904@huawei.com>
    <20150713135223.GB9917@danjae.kornet>
    <4D441676-21A7-46EE-AAB0-EB529D408082@163.com>
In-Reply-To: <4D441676-21A7-46EE-AAB0-EB529D408082@163.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jul 13, 2015 at 10:01:26PM +0800, pi3orama wrote:
>
>
> Sent from my iPhone
>
> > On Jul 13, 2015, at 9:52 PM, Namhyung Kim wrote:
> >
> > Hi,
> >
> >> On Mon, Jul 13, 2015 at 12:36:27PM +0800, He Kuang wrote:
> >> hi, Alexei
> >>
> >>> On 2015/7/11 6:10, Alexei Starovoitov wrote:
> >>>> On 7/10/15 3:03 AM, He Kuang wrote:
> >>>> There are scenarios where we need an eBPF program to record not only
> >>>> kprobe point args, but also the PMU counters, time latencies or the
> >>>> number of cache misses between two probe points and other information
> >>>> when the probe point is entered.
> >>>>
> >>>> This patch adds a new trace event to establish infrastructure for bpf to
> >>>> output data to perf. Userspace perf tools can detect and use this event
> >>>> just like the existing tracepoint events.
> >>>>
> >>>> New bpf trace event entry in debugfs:
> >>>>
> >>>>   /sys/kernel/debug/tracing/events/bpf/bpf_output_data
> >>>>
> >>>> Userspace perf tools detect the new tracepoint event as:
> >>>>
> >>>>   bpf:bpf_output_data [Tracepoint event]
> >>>
> >>> Nice! This approach looks cleanest so far.
> >>>
> >>>> +TRACE_EVENT(bpf_output_data,
> >>>> +
> >>>> +	TP_PROTO(u64 *src, int len),
> >>>> +
> >>>> +	TP_ARGS(src, len),
> >>>> +
> >>>> +	TP_STRUCT__entry(
> >>>> +		__dynamic_array(u64, buf, len)
> >>>> +	),
> >>>> +
> >>>> +	TP_fast_assign(
> >>>> +		memcpy(__get_dynamic_array(buf), src, len * sizeof(u64));
> >>>
> >>> Maybe make it a 'u8' array? The extra multiply and...
> >>
> >> OK
> >>
> >> So the output of three u64 integers (e.g. 0x2060572485, 0x20667b0ff2,
> >> 0x623eb6d) will be this:
> >>
> >>   dd 994 [000] 139.158180: bpf:bpf_output_data: 85 24 57 60 20 00 00 00
> >>   f2 0f 7b 66 20 00 00 00 6d eb 23 06 00 00 00 00
> >>
> >> And users are not restricted to u64 type elements. I'll change that.
> >
> > While this general event format works well, I think it might be hard
> > to know which output came from which program when more than one bpf
> > program is used.
> >
> > I was thinking about providing custom event formats for each bpf
> > program (if needed). The event format definitions might be in a
> > specific directory or in the bpf object itself. Then perf can read those
> > formats and print the output data according to the formats. Maybe we
> > need to add some dynamic event id to match format and data.
> >
>
> I think we can do it on the perf side. Let BPF programs themselves
> encode format information into the array and make perf read and
> decode it. On the kernel side, simply supporting raw data should be
> enough, so we can keep the kernel code as simple as possible.

Yes, of course, I also meant doing all of that work on the perf side. :)

Thanks,
Namhyung