From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752735AbbGMOL7 (ORCPT <rfc822;w@1wt.eu>);
	Mon, 13 Jul 2015 10:11:59 -0400
Received: from mail-pa0-f54.google.com ([209.85.220.54]:33716 "EHLO
	mail-pa0-f54.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752632AbbGMOL3 (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Mon, 13 Jul 2015 10:11:29 -0400
Date: Mon, 13 Jul 2015 23:09:15 +0900
From: Namhyung Kim <namhyung@kernel.org>
To: pi3orama <pi3orama@163.com>
Cc: He Kuang <hekuang@huawei.com>, Alexei Starovoitov <ast@plumgrid.com>,
        "rostedt@goodmis.org" <rostedt@goodmis.org>,
        "masami.hiramatsu.pt@hitachi.com" <masami.hiramatsu.pt@hitachi.com>,
        "acme@kernel.org" <acme@kernel.org>,
        "a.p.zijlstra@chello.nl" <a.p.zijlstra@chello.nl>,
        "mingo@redhat.com" <mingo@redhat.com>,
        "jolsa@kernel.org" <jolsa@kernel.org>,
        "wangnan0@huawei.com" <wangnan0@huawei.com>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to
 perf event
Message-ID: <20150713140915.GD9917@danjae.kornet>
References: <1436522587-136825-1-git-send-email-hekuang@huawei.com>
 <1436522587-136825-4-git-send-email-hekuang@huawei.com>
 <55A042DC.6030809@plumgrid.com>
 <55A3404B.6020904@huawei.com>
 <20150713135223.GB9917@danjae.kornet>
 <4D441676-21A7-46EE-AAB0-EB529D408082@163.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <4D441676-21A7-46EE-AAB0-EB529D408082@163.com>
User-Agent: Mutt/1.5.23+89 (0255b37be491) (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jul 13, 2015 at 10:01:26PM +0800, pi3orama wrote:
> 
> 
> Sent from my iPhone
> 
> > On Jul 13, 2015, at 9:52 PM, Namhyung Kim <namhyung@kernel.org> wrote:
> > 
> > Hi,
> > 
> >> On Mon, Jul 13, 2015 at 12:36:27PM +0800, He Kuang wrote:
> >> hi, Alexei
> >> 
> >>> On 2015/7/11 6:10, Alexei Starovoitov wrote:
> >>>> On 7/10/15 3:03 AM, He Kuang wrote:
> >>>> There are scenarios where we need an eBPF program to record not only
> >>>> kprobe point args, but also PMU counters, time latencies, the
> >>>> number of cache misses between two probe points, and other information
> >>>> when the probe point is entered.
> >>>> 
> >>>> This patch adds a new trace event to establish the infrastructure for
> >>>> bpf to output data to perf. Userspace perf tools can detect and use this
> >>>> event in the same way as the existing tracepoint events.
> >>>> 
> >>>> New bpf trace event entry in debugfs:
> >>>> 
> >>>>     /sys/kernel/debug/tracing/events/bpf/bpf_output_data
> >>>> 
> >>>> Userspace perf tools detect the new tracepoint event as:
> >>>> 
> >>>>     bpf:bpf_output_data                          [Tracepoint event]
> >>> 
> >>> Nice! This approach looks cleanest so far.
> >>> 
> >>>> +TRACE_EVENT(bpf_output_data,
> >>>> +
> >>>> +    TP_PROTO(u64 *src, int len),
> >>>> +
> >>>> +    TP_ARGS(src, len),
> >>>> +
> >>>> +    TP_STRUCT__entry(
> >>>> +        __dynamic_array(u64,        buf,        len)
> >>>> +    ),
> >>>> +
> >>>> +    TP_fast_assign(
> >>>> +        memcpy(__get_dynamic_array(buf), src, len * sizeof(u64));
> >>> 
> >>> may be make it 'u8' array? The extra multiply and...
> >> 
> >> OK
> >> 
> >> So the output of three u64 integers (e.g. 0x2060572485, 0x20667b0ff2,
> >> 0x623eb6d) will be this:
> >> 
> >>  dd 994 [000] 139.158180: bpf:bpf_output_data: 85 24 57 60 20 00 00 00
> >>  f2 0f 7b 66 20 00 00 00 6d eb 23 06 00 00 00 00
> >> 
> >> And users are not restricted to u64 type elements. I'll change that.
> > 
> > While this general event format works well, I think it might be hard
> > to know which output came from which program when more than one bpf
> > program is used.
> > 
> > I was thinking about providing custom event formats for each bpf
> > program (if needed).  The event format definitions might be in a
> > specific directory or a bpf object itself.  Then perf can read those
> > formats and print the output data according to the formats.  Maybe we
> > need to add some dynamic event id to match format and data.
> > 
>
> I think we can do it on the perf side. Let BPF programs themselves
> encode format information into the array and let perf read and
> decode it. On the kernel side, simply supporting raw data should be
> enough, so we can keep the kernel code as simple as possible.

Yes, of course. I also meant doing all of that work on the perf side. :)

Thanks,
Namhyung