Subject: Re: perf bpf examples
To: Brendan Gregg
References: <577DB50D.3040204@huawei.com> <577F297E.6030405@huawei.com>
CC: "linux-perf-use.", "linux-kernel@vger.kernel.org", "Arnaldo Carvalho de Melo", Alexei Starovoitov
From: "Wangnan (F)"
Message-ID: <577F8473.1090208@huawei.com>
Date: Fri, 8 Jul 2016 18:46:11 +0800
X-Mailing-List: linux-kernel@vger.kernel.org

On 2016/7/8 15:57, Brendan Gregg wrote:
> On Thu, Jul 7, 2016 at 9:18 PM, Wangnan (F) wrote:
>>
>> On 2016/7/8 1:58, Brendan Gregg wrote:
>>> On Thu, Jul 7, 2016 at 10:54 AM, Brendan Gregg
>>> wrote:
>>>> On Wed, Jul 6, 2016 at 6:49 PM, Wangnan (F) wrote:
> [...]
>>> ... Also, has anyone looked into perf sampling (-F 99) with bpf yet?
>>> Thanks,
>>
>> Theoretically, a BPF program is an additional filter that decides
>> whether an event should be filtered out or passed to perf. -F 99
>> is another filter, which drops samples to ensure the frequency.
>> The filters work together.
>> The full graph should be:
>>
>>  BPF --> traditional filter --> proc (system wide or proc specific) --> period
>>
>> See the example at the end of this mail. The BPF program returns 0 for half
>> of the events, and the result should be symmetrical. We can get a similar
>> result without -F:
>>
>> # ~/perf record -a --clang-opt '-DCATCH_ODD' -e ./sampling.c dd if=/dev/zero of=/dev/null count=8388480
>> 8388480+0 records in
>> 8388480+0 records out
>> 4294901760 bytes (4.3 GB) copied, 11.9908 s, 358 MB/s
>> [ perf record: Woken up 28 times to write data ]
>> [ perf record: Captured and wrote 303.915 MB perf.data (4194449 samples) ]
>> #
>> root@wn-Lenovo-Product:~# ~/perf record -a --clang-opt '-DCATCH_EVEN' -e ./sampling.c dd if=/dev/zero of=/dev/null count=8388480
>> 8388480+0 records in
>> 8388480+0 records out
>> 4294901760 bytes (4.3 GB) copied, 12.1154 s, 355 MB/s
>> [ perf record: Woken up 54 times to write data ]
>> [ perf record: Captured and wrote 303.933 MB perf.data (4194347 samples) ]
>>
>>
>> With -F99 added:
>>
>> # ~/perf record -F99 -a --clang-opt '-DCATCH_ODD' -e ./sampling.c dd if=/dev/zero of=/dev/null count=8388480
>> 8388480+0 records in
>> 8388480+0 records out
>> 4294901760 bytes (4.3 GB) copied, 9.60126 s, 447 MB/s
>> [ perf record: Woken up 1 times to write data ]
>> [ perf record: Captured and wrote 0.402 MB perf.data (35 samples) ]
>> # ~/perf record -F99 -a --clang-opt '-DCATCH_EVEN' -e ./sampling.c dd if=/dev/zero of=/dev/null count=8388480
>> 8388480+0 records in
>> 8388480+0 records out
>> 4294901760 bytes (4.3 GB) copied, 9.76719 s, 440 MB/s
>> [ perf record: Woken up 1 times to write data ]
>> [ perf record: Captured and wrote 0.399 MB perf.data (37 samples) ]
> That looks like it's doing two different things: -F99, and a
> sampling.c script (SEC("func=sys_read")).
>
> I mean just an -F99 that executes a BPF program on each sample.
> My most common use for perf is:
>
> perf record -F 99 -a -g -- sleep 30
> perf report (or perf script, for making flame graphs)
>
> But this uses perf.data as an intermediate file. With the recent
> BPF_MAP_TYPE_STACK_TRACE, we could frequency count stack traces in
> kernel context, and just dump a report. Much more efficient. And
> improving a very common perf one-liner.

You can't attach a BPF script to samples from events other than kprobes
and tracepoints. When you use 'perf record -F99 -a -g -- sleep 30', you
are sampling on the 'cycles:ppp' event, which is a hardware PMU event.

If we can find a kprobe or tracepoint event that is triggered 99 times
per second, we can use BPF_MAP_TYPE_STACK_TRACE and bpf_get_stackid().

Thank you.
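
For illustration only, the counting idea discussed above (assign each stack trace an id, then bump a per-id counter on every event instead of writing every sample to perf.data) can be modelled in user space. This is a rough Python sketch, not BPF code: the class StackTraceMap, the functions count_stacks() and report(), and the stack contents are all hypothetical names made up for this example. The real bpf_get_stackid() helper returns a hash-based id from the kernel map; here the id is simply the order of first appearance.

```python
from collections import Counter


class StackTraceMap:
    """User-space stand-in for a BPF_MAP_TYPE_STACK_TRACE map (hypothetical model)."""

    def __init__(self):
        self._ids = {}     # stack (tuple of frames) -> stack id
        self._stacks = []  # stack id -> stack (tuple of frames)

    def get_stackid(self, stack):
        """Return a stable id for this stack, storing the stack on first sight."""
        key = tuple(stack)
        if key not in self._ids:
            self._ids[key] = len(self._stacks)
            self._stacks.append(key)
        return self._ids[key]

    def lookup(self, stack_id):
        """Return the frames stored under a stack id."""
        return self._stacks[stack_id]


def count_stacks(events):
    """Model of the per-event handler: one counter increment per event."""
    stack_map = StackTraceMap()
    counts = Counter()
    for stack in events:
        counts[stack_map.get_stackid(stack)] += 1
    return stack_map, counts


def report(stack_map, counts):
    """Return (count, folded stack) pairs, most frequent first."""
    return [(n, ";".join(stack_map.lookup(sid))) for sid, n in counts.most_common()]


# Made-up stacks, root frame first; real frames would come from the kernel.
events = [
    ["entry", "sys_read", "vfs_read"],
    ["entry", "sys_write", "vfs_write"],
    ["entry", "sys_read", "vfs_read"],
]
stack_map, counts = count_stacks(events)
for n, folded in report(stack_map, counts):
    print(n, folded)
```

Only the per-stack counters and one copy of each distinct stack are kept, so the amount of data to dump at the end depends on the number of distinct stacks, not on the number of events. The folded "frame;frame;frame count" shape is chosen here because it is a common input format for flame graph tools.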