From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751139AbbGMEhK (ORCPT ); Mon, 13 Jul 2015 00:37:10 -0400
Received: from szxga01-in.huawei.com ([58.251.152.64]:27455 "EHLO
	szxga01-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750777AbbGMEhI (ORCPT ); Mon, 13 Jul 2015 00:37:08 -0400
Subject: Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
To: Alexei Starovoitov
References: <1436522587-136825-1-git-send-email-hekuang@huawei.com>
	<1436522587-136825-4-git-send-email-hekuang@huawei.com>
	<55A042DC.6030809@plumgrid.com>
From: He Kuang
Message-ID: <55A3404B.6020904@huawei.com>
Date: Mon, 13 Jul 2015 12:36:27 +0800
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.0
MIME-Version: 1.0
In-Reply-To: <55A042DC.6030809@plumgrid.com>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
X-Originating-IP: [10.110.54.65]
X-CFilter-Loop: Reflected
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

hi, Alexei

On 2015/7/11 6:10, Alexei Starovoitov wrote:
> On 7/10/15 3:03 AM, He Kuang wrote:
>> There're scenarios that we need an eBPF program to record not only
>> kprobe point args, but also the PMU counters, time latencies or the
>> number of cache misses between two probe points and other information
>> when the probe point is entered.
>>
>> This patch adds a new trace event to establish infrastruction for bpf to
>> output data to perf. Userspace perf tools can detect and use this event
>> as using the existing tracepoint events.
>>
>> New bpf trace event entry in debugfs:
>>
>>   /sys/kernel/debug/tracing/events/bpf/bpf_output_data
>>
>> Userspace perf tools detect the new tracepoint event as:
>>
>>   bpf:bpf_output_data                     [Tracepoint event]
>
> Nice! This approach looks cleanest so far.
>
>> +TRACE_EVENT(bpf_output_data,
>> +
>> +	TP_PROTO(u64 *src, int len),
>> +
>> +	TP_ARGS(src, len),
>> +
>> +	TP_STRUCT__entry(
>> +		__dynamic_array(u64, buf, len)
>> +	),
>> +
>> +	TP_fast_assign(
>> +		memcpy(__get_dynamic_array(buf), src, len * sizeof(u64));
>
> may be make it 'u8' array? The extra multiply and...

OK

So the output of three u64 integers (e.g. 0x2060572485, 0x20667b0ff2,
0x623eb6d) will be this:

  dd   994 [000]   139.158180: bpf:bpf_output_data: 85 24 57 60 20 00 00 00 f2 0f 7b 66 20 00 00 00 6d eb 23 06 00 00 00 00

And users are not restricted to u64 type elements. I'll change that.

>
>> +static u64 bpf_output_trace_data(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)
>> +{
>> +	void *src = (void *) (long) r1;
>> +	int size = (int) r2;
>> +
>> +	trace_bpf_output_data(src, size / sizeof(u64));
>
> .. and this silent round down could be confusing to use.
> With array of u8, the program can push any structured data into it
> and let user space interpret it.
>