From: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>
To: Kris Van Hees <kris.van.hees@oracle.com>
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org,
	dtrace-devel@oss.oracle.com, linux-kernel@vger.kernel.org,
	rostedt@goodmis.org, mhiramat@kernel.org, ast@kernel.org,
	daniel@iogearbox.net, Peter Zijlstra <peterz@infradead.org>,
	Chris Mason <clm@fb.com>
Subject: Re: [PATCH 1/1] tools/dtrace: initial implementation of DTrace
Date: Mon, 8 Jul 2019 14:15:37 -0300
Message-ID: <20190708171537.GA11960@kernel.org>
In-Reply-To: <201907040314.x643EUoA017906@aserv0122.oracle.com>

On Wed, Jul 03, 2019 at 08:14:30PM -0700, Kris Van Hees wrote:
> This initial implementation of a tiny subset of DTrace functionality
> provides the following options:
> 
> 	dtrace [-lvV] [-b bufsz] -s script
> 	    -b  set trace buffer size
> 	    -l  list probes (only works with '-s script' for now)
> 	    -s  enable or list probes for the specified BPF program
> 	    -V  report DTrace API version
> 
> The patch comprises quite a bit of code due to DTrace requiring a few
> crucial components, even in its most basic form.
> 
> The code is structured around the command line interface implemented in
> dtrace.c.  It provides option parsing and drives the three modes of
> operation that are currently implemented:
> 
> 1. Report DTrace API version information.
> 	Report the version information and terminate.
> 
> 2. List probes in BPF programs.
> 	Initialize the list of probes that DTrace recognizes, load BPF
> 	programs, parse all BPF ELF section names, resolve them into
> 	known probes, and emit the probe names.  Then terminate.
> 
> 3. Load BPF programs and collect tracing data.
> 	Initialize the list of probes that DTrace recognizes, load BPF
> 	programs and attach them to their corresponding probes, set up
> 	perf event output buffers, and start processing tracing data.
> 
> This implementation makes extensive use of BPF (handled by dt_bpf.c) and
> the perf event output ring buffer (handled by dt_buffer.c).  DTrace-style
> probe handling (dt_probe.c) offers an interface to probes that hides the
> implementation details of the individual probe types by provider (dt_fbt.c
> and dt_syscall.c).  Probe lookup by name uses a hashtable implementation
> (dt_hash.c).  The dt_utils.c code populates a list of online CPU ids, so
> we know what CPUs we can obtain tracing data from.
> 
> Building the tool is trivial because its only dependency (libbpf) is in
> the kernel tree under tools/lib/bpf.  A simple 'make' in the tools/dtrace
> directory suffices.
> 
> The 'dtrace' executable needs to run as root because BPF programs cannot
> be loaded by non-root users.
> 
> Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
> Reviewed-by: David Mc Lean <david.mclean@oracle.com>
> Reviewed-by: Eugene Loh <eugene.loh@oracle.com>
> ---
>  MAINTAINERS                |   6 +
>  tools/dtrace/Makefile      |  88 ++++++++++
>  tools/dtrace/bpf_sample.c  | 145 ++++++++++++++++
>  tools/dtrace/dt_bpf.c      | 188 +++++++++++++++++++++
>  tools/dtrace/dt_buffer.c   | 331 +++++++++++++++++++++++++++++++++++++
>  tools/dtrace/dt_fbt.c      | 201 ++++++++++++++++++++++
>  tools/dtrace/dt_hash.c     | 211 +++++++++++++++++++++++
>  tools/dtrace/dt_probe.c    | 230 ++++++++++++++++++++++++++
>  tools/dtrace/dt_syscall.c  | 179 ++++++++++++++++++++
>  tools/dtrace/dt_utils.c    | 132 +++++++++++++++
>  tools/dtrace/dtrace.c      | 249 ++++++++++++++++++++++++++++
>  tools/dtrace/dtrace.h      |  13 ++
>  tools/dtrace/dtrace_impl.h | 101 +++++++++++
>  13 files changed, 2074 insertions(+)
>  create mode 100644 tools/dtrace/Makefile
>  create mode 100644 tools/dtrace/bpf_sample.c
>  create mode 100644 tools/dtrace/dt_bpf.c
>  create mode 100644 tools/dtrace/dt_buffer.c
>  create mode 100644 tools/dtrace/dt_fbt.c
>  create mode 100644 tools/dtrace/dt_hash.c
>  create mode 100644 tools/dtrace/dt_probe.c
>  create mode 100644 tools/dtrace/dt_syscall.c
>  create mode 100644 tools/dtrace/dt_utils.c
>  create mode 100644 tools/dtrace/dtrace.c
>  create mode 100644 tools/dtrace/dtrace.h
>  create mode 100644 tools/dtrace/dtrace_impl.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 606d1f80bc49..668468834865 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -5474,6 +5474,12 @@ W:	https://linuxtv.org
>  S:	Odd Fixes
>  F:	drivers/media/pci/dt3155/
>  
> +DTRACE
> +M:	Kris Van Hees <kris.van.hees@oracle.com>
> +L:	dtrace-devel@oss.oracle.com
> +S:	Maintained
> +F:	tools/dtrace/
> +
>  DVB_USB_AF9015 MEDIA DRIVER
>  M:	Antti Palosaari <crope@iki.fi>
>  L:	linux-media@vger.kernel.org
> diff --git a/tools/dtrace/Makefile b/tools/dtrace/Makefile
> new file mode 100644
> index 000000000000..99fd0f9dd1d6
> --- /dev/null
> +++ b/tools/dtrace/Makefile
> @@ -0,0 +1,88 @@
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# This Makefile is based on samples/bpf.
> +#
> +# Copyright (c) 2019, Oracle and/or its affiliates. All rights reserved.
> +
> +DT_VERSION		:= 2.0.0
> +DT_GIT_VERSION		:= $(shell git rev-parse HEAD 2>/dev/null || \
> +				   echo Unknown)
> +
> +DTRACE_PATH		?= $(abspath $(srctree)/$(src))
> +TOOLS_PATH		:= $(DTRACE_PATH)/..
> +SAMPLES_PATH		:= $(DTRACE_PATH)/../../samples
> +
> +hostprogs-y		:= dtrace
> +
> +LIBBPF			:= $(TOOLS_PATH)/lib/bpf/libbpf.a
> +OBJS			:= dt_bpf.o dt_buffer.o dt_utils.o dt_probe.o \
> +			   dt_hash.o \
> +			   dt_fbt.o dt_syscall.o
> +
> +dtrace-objs		:= $(OBJS) dtrace.o
> +
> +always			:= $(hostprogs-y)
> +always			+= bpf_sample.o
> +
> +KBUILD_HOSTCFLAGS	+= -DDT_VERSION=\"$(DT_VERSION)\"
> +KBUILD_HOSTCFLAGS	+= -DDT_GIT_VERSION=\"$(DT_GIT_VERSION)\"
> +KBUILD_HOSTCFLAGS	+= -I$(srctree)/tools/lib
> +KBUILD_HOSTCFLAGS	+= -I$(srctree)/tools/perf

Interesting, what are you using from tools/perf/? If we know that, we
can move it to tools/{include,lib,arch}.

> +KBUILD_HOSTCFLAGS	+= -I$(srctree)/tools/include/uapi
> +KBUILD_HOSTCFLAGS	+= -I$(srctree)/tools/include/
> +KBUILD_HOSTCFLAGS	+= -I$(srctree)/usr/include
> +
> +KBUILD_HOSTLDLIBS	:= $(LIBBPF) -lelf
> +
> +LLC			?= llc
> +CLANG			?= clang
> +LLVM_OBJCOPY		?= llvm-objcopy
> +
> +ifdef CROSS_COMPILE
> +HOSTCC			= $(CROSS_COMPILE)gcc
> +CLANG_ARCH_ARGS		= -target $(ARCH)
> +endif
> +
> +all:
> +	$(MAKE) -C ../../ $(CURDIR)/ DTRACE_PATH=$(CURDIR)
> +
> +clean:
> +	$(MAKE) -C ../../ M=$(CURDIR) clean
> +	@rm -f *~
> +
> +$(LIBBPF): FORCE
> +	$(MAKE) -C $(dir $@) RM='rm -rf' LDFLAGS= srctree=$(DTRACE_PATH)/../../ O=
> +
> +FORCE:
> +
> +.PHONY: verify_cmds verify_target_bpf $(CLANG) $(LLC)
> +
> +verify_cmds: $(CLANG) $(LLC)
> +	@for TOOL in $^ ; do \
> +		if ! (which -- "$${TOOL}" > /dev/null 2>&1); then \
> +			echo "*** ERROR: Cannot find LLVM tool $${TOOL}" ;\
> +			exit 1; \
> +		else true; fi; \
> +	done
> +
> +verify_target_bpf: verify_cmds
> +	@if ! (${LLC} -march=bpf -mattr=help > /dev/null 2>&1); then \
> +		echo "*** ERROR: LLVM (${LLC}) does not support 'bpf' target" ;\
> +		echo "   NOTICE: LLVM version >= 3.7.1 required" ;\
> +		exit 2; \
> +	else true; fi
> +
> +$(DTRACE_PATH)/*.c: verify_target_bpf $(LIBBPF)
> +$(src)/*.c: verify_target_bpf $(LIBBPF)
> +
> +$(obj)/%.o: $(src)/%.c
> +	@echo "  CLANG-bpf " $@
> +	$(Q)$(CLANG) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) -I$(obj) \
> +		-I$(srctree)/tools/testing/selftests/bpf/ \
> +		-D__KERNEL__ -D__BPF_TRACING__ -Wno-unused-value -Wno-pointer-sign \
> +		-D__TARGET_ARCH_$(ARCH) -Wno-compare-distinct-pointer-types \
> +		-Wno-gnu-variable-sized-type-not-at-end \
> +		-Wno-address-of-packed-member -Wno-tautological-compare \
> +		-Wno-unknown-warning-option $(CLANG_ARCH_ARGS) \
> +		-I$(srctree)/samples/bpf/ -include asm_goto_workaround.h \
> +		-O2 -emit-llvm -c $< -o -| $(LLC) -march=bpf $(LLC_FLAGS) -filetype=obj -o $@


We have the above in tools/perf/util/llvm-utils.c; perhaps we need to
move it to some place in lib/ so that it can be shared?
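
As a rough illustration of what a shared helper might look like, here is
a minimal sketch that formats the same "clang | llc" pipeline as the
Makefile rule quoted above. The function name bpf_build_cmd() and its
parameters are hypothetical; this is not the code in llvm-utils.c, and
the include paths and warning flags from the rule are left out for
brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the "clang -emit-llvm | llc -march=bpf"
 * pipeline into buf.  Only the core flags from the Makefile rule are
 * reproduced here.
 *
 * Returns 0 on success, -1 if the command does not fit in buf.
 */
static int bpf_build_cmd(char *buf, size_t len, const char *clang,
			 const char *llc, const char *src, const char *obj)
{
	int n;

	assert(buf && clang && llc && src && obj);

	n = snprintf(buf, len,
		     "%s -O2 -emit-llvm -c %s -o - | "
		     "%s -march=bpf -filetype=obj -o %s",
		     clang, src, llc, obj);

	/* snprintf() reports the untruncated length; reject truncation. */
	if (n < 0 || (size_t)n >= len)
		return -1;

	return 0;
}
```

A caller would pass the resulting string to whatever mechanism it uses
to run external commands; both perf and a tool like this one could then
share the one place where the flags are defined.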

> diff --git a/tools/dtrace/bpf_sample.c b/tools/dtrace/bpf_sample.c
> new file mode 100644
> index 000000000000..49f350390b5f
> --- /dev/null
> +++ b/tools/dtrace/bpf_sample.c
> @@ -0,0 +1,145 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * This sample DTrace BPF tracing program demonstrates how actions can be
> + * associated with different probe types.
> + *
> + * The kprobe/ksys_write probe is a Function Boundary Tracing (FBT) entry probe
> + * on the ksys_write(fd, buf, count) function in the kernel.  Arguments to the
> + * function can be retrieved from the CPU registers (struct pt_regs).
> + *
> + * The tracepoint/syscalls/sys_enter_write probe is a System Call entry probe
> + * for the write(d, buf, count) system call.  Arguments to the system call can
> + * be retrieved from the tracepoint data passed to the BPF program as context
> + * (struct syscall_data) when the probe fires.
> + *
> + * The BPF program associated with each probe prepares a DTrace BPF context
> + * (struct dt_bpf_context) that stores the probe ID and up to 10 arguments.
> + * Only 3 arguments are used in this sample.  Then the programs call a shared
> + * BPF function (bpf_action) that implements the actual action to be taken when
> + * a probe fires.  It prepares a data record to be stored in the tracing buffer
> + * and submits it to the buffer.  The data in the data record is obtained from
> + * the DTrace BPF context.
> + *
> + * Copyright (c) 2019, Oracle and/or its affiliates. All rights reserved.
> + */
> +#include <uapi/linux/bpf.h>
> +#include <linux/ptrace.h>
> +#include <linux/version.h>
> +#include <uapi/linux/unistd.h>
> +#include "bpf_helpers.h"
> +
> +#include "dtrace.h"
> +
> +struct syscall_data {
> +	struct pt_regs *regs;
> +	long syscall_nr;
> +	long arg[6];
> +};
> +
> +struct bpf_map_def SEC("maps") buffers = {
> +	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
> +	.key_size = sizeof(u32),
> +	.value_size = sizeof(u32),
> +	.max_entries = NR_CPUS,
> +};
> +
> +#if defined(__amd64)
> +# define GET_REGS_ARG0(regs)	((regs)->di)
> +# define GET_REGS_ARG1(regs)	((regs)->si)
> +# define GET_REGS_ARG2(regs)	((regs)->dx)
> +# define GET_REGS_ARG3(regs)	((regs)->cx)
> +# define GET_REGS_ARG4(regs)	((regs)->r8)
> +# define GET_REGS_ARG5(regs)	((regs)->r9)
> +#else
> +# warning Argument retrieval from pt_regs is not supported yet on this arch.
> +# define GET_REGS_ARG0(regs)	0
> +# define GET_REGS_ARG1(regs)	0
> +# define GET_REGS_ARG2(regs)	0
> +# define GET_REGS_ARG3(regs)	0
> +# define GET_REGS_ARG4(regs)	0
> +# define GET_REGS_ARG5(regs)	0
> +#endif

We have this in tools/testing/selftests/bpf/bpf_helpers.h; it probably
needs to move to some other place in tools/include/ where it can be shared.
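
A minimal sketch of how the argument lookup could be expressed on top of
shared macros, assuming x86_64 only. The macro names follow the
PT_REGS_PARM* style used in bpf_helpers.h; struct mock_pt_regs and
dt_regs_arg() are hypothetical stand-ins so that the example compiles
outside the kernel tree. A real BPF program would use the kernel's
struct pt_regs instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-in for the x86_64 register set (assumption: only the six
 * argument registers are modelled).
 */
struct mock_pt_regs {
	uint64_t di, si, dx, cx, r8, r9;
};

/* Same register mapping as GET_REGS_ARG0..5 in the patch (x86_64). */
#define PT_REGS_PARM1(x)	((x)->di)
#define PT_REGS_PARM2(x)	((x)->si)
#define PT_REGS_PARM3(x)	((x)->dx)
#define PT_REGS_PARM4(x)	((x)->cx)
#define PT_REGS_PARM5(x)	((x)->r8)

/*
 * Hypothetical accessor: return argument n (0-based) of the probed
 * function, or 0 if n is out of range.
 */
static inline uint64_t dt_regs_arg(const struct mock_pt_regs *regs,
				   unsigned int n)
{
	assert(regs != NULL);

	switch (n) {
	case 0: return PT_REGS_PARM1(regs);
	case 1: return PT_REGS_PARM2(regs);
	case 2: return PT_REGS_PARM3(regs);
	case 3: return PT_REGS_PARM4(regs);
	case 4: return PT_REGS_PARM5(regs);
	case 5: return regs->r9;	/* sixth argument register */
	default: return 0;
	}
}
```

With the per-architecture mapping kept in one shared header, the sample
program would only need the accessor and no #if block of its own.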

- Arnaldo
