From mboxrd@z Thu Jan 1 00:00:00 1970 From: Yonghong Song Subject: Re: [PATCH bpf-next 2/7] bpf: introduce bpf subcommand BPF_PERF_EVENT_QUERY Date: Thu, 17 May 2018 10:50:38 -0700 Message-ID: References: <20180515234521.856763-1-yhs@fb.com> <20180515234521.856763-3-yhs@fb.com> <20180516112734.GE12217@hirez.programming.kicks-ass.net> <63ef8aa3-1e01-e0cb-1f81-d41d87db0565@fb.com> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8"; format=flowed Content-Transfer-Encoding: 8bit Cc: , , To: Daniel Borkmann , Peter Zijlstra Return-path: Received: from mx0b-00082601.pphosted.com ([67.231.153.30]:57722 "EHLO mx0b-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751642AbeEQRv2 (ORCPT ); Thu, 17 May 2018 13:51:28 -0400 In-Reply-To: Content-Language: en-US Sender: netdev-owner@vger.kernel.org List-ID: On 5/17/18 8:32 AM, Daniel Borkmann wrote: > On 05/16/2018 11:59 PM, Yonghong Song wrote: >> On 5/16/18 4:27 AM, Peter Zijlstra wrote: >>> On Tue, May 15, 2018 at 04:45:16PM -0700, Yonghong Song wrote: >>>> Currently, suppose a userspace application has loaded a bpf program >>>> and attached it to a tracepoint/kprobe/uprobe, and a bpf >>>> introspection tool, e.g., bpftool, wants to show which bpf program >>>> is attached to which tracepoint/kprobe/uprobe. Such attachment >>>> information will be really useful to understand the overall bpf >>>> deployment in the system. >>>> >>>> There is a name field (16 bytes) for each program, which could >>>> be used to encode the attachment point. There are some drawbacks >>>> for this approaches. First, bpftool user (e.g., an admin) may not >>>> really understand the association between the name and the >>>> attachment point. Second, if one program is attached to multiple >>>> places, encoding a proper name which can imply all these >>>> attachments becomes difficult. >>>> >>>> This patch introduces a new bpf subcommand BPF_PERF_EVENT_QUERY. >>>> Given a pid and fd, if the is associated with a >>>> tracepoint/kprobe/uprobea perf event, BPF_PERF_EVENT_QUERY will return >>>>     . prog_id >>>>     . tracepoint name, or >>>>     . k[ret]probe funcname + offset or kernel addr, or >>>>     . u[ret]probe filename + offset >>>> to the userspace. >>>> The user can use "bpftool prog" to find more information about >>>> bpf program itself with prog_id. >>>> >>>> Signed-off-by: Yonghong Song >>>> --- >>>>   include/linux/trace_events.h |  15 ++++++ >>>>   include/uapi/linux/bpf.h     |  25 ++++++++++ >>>>   kernel/bpf/syscall.c         | 113 +++++++++++++++++++++++++++++++++++++++++++ >>>>   kernel/trace/bpf_trace.c     |  53 ++++++++++++++++++++ >>>>   kernel/trace/trace_kprobe.c  |  29 +++++++++++ >>>>   kernel/trace/trace_uprobe.c  |  22 +++++++++ >>>>   6 files changed, 257 insertions(+) >>> >>> Why is the command called *_PERF_EVENT_* ? Are there not a lot of !perf >>> places to attach BPF proglets? >> >> Just gave a complete picture, the below are major places to attach >> BPF programs: >>    . perf based (through perf ioctl) >>    . raw tracepoint based (through bpf interface) >> >>    . netlink interface for tc, xdp, tunneling >>    . setsockopt for socket filters >>    . cgroup based (bpf attachment subcommand) >>      mostly networking and io devices >>    . some other networking socket related (sk_skb stream/parser/verdict, >>      sk_msg verdict) through bpf attachment subcommand. >> >> Currently, for cgroup based attachment, we have BPF_PROG_QUERY with input cgroup file descriptor. For other networking based queries, we >> may need to enumerate tc filters, networking devices, open sockets, etc. >> to get the attachment information. >> >> So to have one BPF_QUERY command line may be too complex to >> cover all cases. >> >> But you are right that BPF_PERF_EVENT_QUERY name is too narrow since >> it should be used for other (pid, fd) based queries as well (e.g., socket, or other potential uses in the future). >> >> How about the subcommand name BPF_TASK_FD_QUERY and make bpf_attr.task_fd_query extensible? > > I like the introspection output it provides in 7/7, it's really great! > So the query interface would only ever be tied to BPF progs whose attach > life time is tied to the life time of the application and as soon as all > refs on the fd are released it's unloaded from the system. BPF_TASK_FD_QUERY > seems okay to me, or something like BPF_ATTACH_QUERY. Even if the name is > slightly more generic, it might be more fitting with other cmds like > BPF_PROG_QUERY we have where we tell an attach point to retrieve all progs > from it (though only tied to cgroups right now, it may not be in future). I think BPF_TASK_FD_QUERY is okay. Using BPF_ATTACH_QUERY indeed seems a little bit broader to me as other query subcommands are possible to query attachments with different input. BPF_PROG_QUERY is also trying to query attachment. Currently, given a cgroup fd, it will query prog array attached. Sean has the patch to attach bpf programs to a RC device, and given a device fd, it will query prog array attached to that device. > > For all the others that are not strictly tied to the task but global, bpftool > would then need to be extended to query the various other interfaces like > netlink for retrieval which is on todo for some point in future as well. So > this set nicely complements this introspection aspect. Totally agree. Thanks! > > Thanks, > Daniel >