From: "Wangnan (F)" <wangnan0@huawei.com>
To: Arnaldo Carvalho de Melo <acme@kernel.org>,
Peter Zijlstra <peterz@infradead.org>
Cc: "Alexei Starovoitov" <ast@kernel.org>,
"Arnaldo Carvalho de Melo" <acme@redhat.com>,
"Brendan Gregg" <brendan.d.gregg@gmail.com>,
"Adrian Hunter" <adrian.hunter@intel.com>,
"Cody P Schafer" <dev@codyps.com>,
"David S. Miller" <davem@davemloft.net>,
"He Kuang" <hekuang@huawei.com>,
"Jérémie Galarneau" <jeremie.galarneau@efficios.com>,
"Jiri Olsa" <jolsa@kernel.org>,
"Kirill Smelkov" <kirr@nexedi.com>,
"Li Zefan" <lizefan@huawei.com>,
"Masami Hiramatsu" <masami.hiramatsu.pt@hitachi.com>,
"Namhyung Kim" <namhyung@kernel.org>,
pi3orama@163.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 10/48] perf tools: Introduce bpf-output event
Date: Wed, 24 Feb 2016 12:03:36 +0800
Message-ID: <56CD2B98.4040009@huawei.com>
In-Reply-To: <56CD0FC4.3070305@huawei.com>
On 2016/2/24 10:04, Wangnan (F) wrote:
>
>
> On 2016/2/24 9:58, Wangnan (F) wrote:
>>
>>
>> On 2016/2/24 1:45, Arnaldo Carvalho de Melo wrote:
>>> Em Mon, Feb 22, 2016 at 09:10:37AM +0000, Wang Nan escreveu:
>>>> Commit a43eec304259a6c637f4014a6d4767159b6a3aa3 (bpf: introduce
>>>> bpf_perf_event_output() helper) add a helper to enable BPF program
>>>> output data to perf ring buffer through a new type of perf event
>>>> PERF_COUNT_SW_BPF_OUTPUT. This patch enable perf to create perf
>>>> event of that type. Now perf user can use following cmdline to
>>>> receive output data from BPF programs:
>>>>
>>>> # ./perf record -a -e bpf-output/no-inherit,name=evt/ \
>>>> -e ./test_bpf_output.c/map:channel.event=evt/
>>>> ls /
>>>> # ./perf script
>>>> perf 1560 [004] 347747.086295:
>>>> evt: ffffffff811fd201 sys_write ...
>>>> perf 1560 [004] 347747.086300:
>>>> evt: ffffffff811fd201 sys_write ...
>>>> perf 1560 [004] 347747.086315:
>>>> evt: ffffffff811fd201 sys_write ...
>>>> ...
>>>>
>>>> Test result:
>>>> # cat ./test_bpf_output.c
>>>> /************************ BEGIN **************************/
>>>> #include <uapi/linux/bpf.h>
>>>> struct bpf_map_def {
>>>> unsigned int type;
>>>> unsigned int key_size;
>>>> unsigned int value_size;
>>>> unsigned int max_entries;
>>>> };
>>>>
>>>> #define SEC(NAME) __attribute__((section(NAME), used))
>>>> static u64 (*ktime_get_ns)(void) =
>>>> (void *)BPF_FUNC_ktime_get_ns;
>>>> static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
>>>> (void *)BPF_FUNC_trace_printk;
>>>> static int (*get_smp_processor_id)(void) =
>>>> (void *)BPF_FUNC_get_smp_processor_id;
>>>> static int (*perf_event_output)(void *, struct bpf_map_def *,
>>>> int, void *, unsigned long) =
>>>> (void *)BPF_FUNC_perf_event_output;
>>>>
>>>> struct bpf_map_def SEC("maps") channel = {
>>>> .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
>>>> .key_size = sizeof(int),
>>>> .value_size = sizeof(u32),
>>>> .max_entries = __NR_CPUS__,
>>>> };
>>>>
>>>> SEC("func_write=sys_write")
>>>> int func_write(void *ctx)
>>>> {
>>>> struct {
>>>> u64 ktime;
>>>> int cpuid;
>>>> } __attribute__((packed)) output_data;
>>>> char error_data[] = "Error: failed to output: %d\n";
>>>>
>>>> output_data.cpuid = get_smp_processor_id();
>>>> output_data.ktime = ktime_get_ns();
>>>> int err = perf_event_output(ctx, &channel,
>>>> get_smp_processor_id(),
>>>> &output_data, sizeof(output_data));
>>>> if (err)
>>>> trace_printk(error_data, sizeof(error_data), err);
>>>> return 0;
>>>> }
>>>> char _license[] SEC("license") = "GPL";
>>>> int _version SEC("version") = LINUX_VERSION_CODE;
>>>> /************************ END ***************************/
>>>>
>>>> # ./perf record -a -e bpf-output/no-inherit,name=evt/ \
>>>> -e ./test_bpf_output.c/map:channel.event=evt/
>>>> ls /
>>>> # ./perf script | grep ls
>>>> ls 2242 [003] 347851.557563: evt: ffffffff811fd201
>>>> sys_write ...
>>>> ls 2242 [003] 347851.557571: evt: ffffffff811fd201
>>>> sys_write ...
>>> So, there is something strange here:
>>>
>>> if (unlikely(event->oncpu != smp_processor_id()))
>>> return -EOPNOTSUPP;
>>>
>>
>
> All failures have 'event->oncpu == -1' here. I guess we should
> suppress warning in this case. But why event->oncpu becomes -1?
>
For this specific test it is not surprising to see these error messages.
In this test we create the bpf-output channel on the 'ls' process only,
but the BPF script is triggered on all processes (BPF triggering is not
related to perf event scheduling). Trying to output data through the
'ls'-specific bpf-output channel should fail if the 'sys_write' is not
issued by 'ls' or its children, so this is correct behavior.
However, I also see the errors with a system-wide channel:
# echo "" > /sys/kernel/debug/tracing/trace
# ./perf record -a -e bpf-output/no-inherit,name=evt/ \
-e ./test_bpf_output.c/map:channel.event=evt/
-a
^C[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 17.534 MB perf.data (264326 samples) ]
# cat /sys/kernel/debug/tracing/trace | tail
rs:main Q:Reg-582 [000] d..2 4858.711225: : Error: failed to output: -95
rs:main Q:Reg-582 [000] d..2 4858.711241: : Error: failed to output: -95
gmain-1858 [003] d..2 4858.711436: : Error: failed to output: -95
gmain-1858 [003] d..2 4858.711441: : Error: failed to output: -95
gmain-1858 [003] d..2 4858.711473: : Error: failed to output: -95
rs:main Q:Reg-582 [002] d..2 4858.712215: : Error: failed to output: -95
rs:main Q:Reg-582 [002] d..2 4858.712224: : Error: failed to output: -95
gmain-1858 [003] d..2 4858.712230: : Error: failed to output: -95
rs:main Q:Reg-582 [002] d..2 4858.712235: : Error: failed to output: -95
rs:main Q:Reg-582 [002] d..2 4858.712239: : Error: failed to output: -95
System-wide events can also be scheduled in and out. If the bpf-output
events are scheduled out, trying to output data through them causes the
above failure. I don't think it is a problem.
Peter, could you please give some information? In which case would a
system-wide bpf-output channel be scheduled out?
Thank you.