Netdev List


* Re: [PATCH v3 1/3] net: phy: mscc: move shared probe code into a helper
From: Heiko Stübner @ 2020-06-16  9:10 UTC
  To: David Miller
  Cc: kuba, robh+dt, andrew, f.fainelli, hkallweit1, linux, netdev,
	devicetree, linux-kernel, christoph.muellner
In-Reply-To: <20200615.181225.2016760272076151342.davem@davemloft.net>

Hi,

Am Dienstag, 16. Juni 2020, 03:12:25 CEST schrieb David Miller:
> From: David Miller <davem@davemloft.net>
> Date: Mon, 15 Jun 2020 18:11:29 -0700 (PDT)
> > +	return devm_phy_package_join(&phydev->mdio.dev, phydev,
> > +				     vsc8531->base_addr, 0);
> 
> But it is still dereferenced here.
> 
> Did the compiler really not warn you about this when you test built
> these changes?

I'm wondering that myself ... it probably did and I overlooked it, which
is also suggested by the fact that I added the declaration of
vsc8531 when rebasing.

> > Because you removed this devm_kzalloc() code, vsc8531 is never initialized.
> 
> You also need to provide a proper header posting when you repost this series
> after fixing this bug.

Not sure I understand what you mean by "header posting" here.

Thanks
Heiko




* Re: [PATCH v4 1/3] mm/slab: Use memzero_explicit() in kzfree()
From: Dan Carpenter @ 2020-06-16  9:08 UTC
  To: Michal Hocko
  Cc: Waiman Long, Jason A . Donenfeld, linux-btrfs, Jarkko Sakkinen,
	David Sterba, David Howells, linux-mm, linux-sctp, keyrings,
	kasan-dev, linux-stm32, devel, linux-cifs, linux-scsi,
	James Morris, Matthew Wilcox, linux-wpan, David Rientjes,
	linux-pm, ecryptfs, linux-fscrypt, linux-mediatek, linux-amlogic,
	virtualization, linux-integrity, linux-nfs, Linus Torvalds,
	linux-wireless, linux-kernel, stable, linux-bluetooth,
	linux-security-module, target-devel, tipc-discussion,
	linux-crypto, Johannes Weiner, Joe Perches, Andrew Morton,
	linuxppc-dev, netdev, wireguard, linux-ppp
In-Reply-To: <20200616064208.GA9499@dhcp22.suse.cz>

On Tue, Jun 16, 2020 at 08:42:08AM +0200, Michal Hocko wrote:
> On Mon 15-06-20 21:57:16, Waiman Long wrote:
> > The kzfree() function is normally used to clear some sensitive
> > information, like encryption keys, in the buffer before freeing it back
> > to the pool. Memset() is currently used for the buffer clearing. However,
> > it is entirely possible that the compiler may choose to optimize away the
> > memory clearing especially if LTO is being used. To make sure that this
> > optimization will not happen, memzero_explicit(), which is introduced
> > in v3.18, is now used in kzfree() to do the clearing.
> > 
> > Fixes: 3ef0e5ba4673 ("slab: introduce kzfree()")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Waiman Long <longman@redhat.com>
> 
> Acked-by: Michal Hocko <mhocko@suse.com>
> 
> Although I am not really sure this is a stable material. Is there any
> known instance where the memset was optimized out from kzfree?

I told him to add the stable.  Otherwise it will just get reported to
me again.  It's just safer to backport it before we forget.
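
For readers following the dead-store concern in the quoted patch
description, here is a minimal user-space sketch of the idea. It assumes
GCC or Clang inline asm; zero_explicit(), all_zero() and wipe_demo() are
made-up names for illustration, not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical user-space analogue of memzero_explicit(): a plain
 * memset() followed by an empty asm statement that takes the pointer
 * as input and clobbers memory, so the compiler has to assume the
 * zeroed bytes may still be read and keeps the store.
 */
static void zero_explicit(void *buf, size_t len)
{
	memset(buf, 0, len);
	__asm__ __volatile__("" : : "r"(buf) : "memory");
}

/* Returns 1 if every byte in buf is zero, 0 otherwise. */
static int all_zero(const unsigned char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (buf[i])
			return 0;
	return 1;
}

/* Copy a secret into a scratch buffer, wipe it, report the result. */
static int wipe_demo(const char *secret)
{
	unsigned char key[32];
	size_t len = strlen(secret);

	assert(len < sizeof(key));
	memcpy(key, secret, len + 1);
	zero_explicit(key, sizeof(key));
	return all_zero(key, sizeof(key));
}
```

With a bare memset() as the last access to a buffer that is about to be
freed or go out of scope, the compiler is allowed to drop the store; the
barrier is what removes that freedom.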

regards,
dan carpenter



* Re: [PATCHv4 bpf-next 0/2] xdp: add dev map multicast support
From: Hangbin Liu @ 2020-06-16  9:47 UTC
  To: Jesper Dangaard Brouer
  Cc: Toke Høiland-Jørgensen, bpf, netdev, Jiri Benc,
	Eelco Chaudron, ast, Daniel Borkmann, Lorenzo Bianconi
In-Reply-To: <20200616110922.1219ec5e@carbon>

On Tue, Jun 16, 2020 at 11:09:22AM +0200, Jesper Dangaard Brouer wrote:
> > > BTW, when using pktgen, I got a panic because the skb doesn't have
> > > enough headroom. The code path looks like
> > >
> > > do_xdp_generic()
> > >   - netif_receive_generic_xdp()
> > >     - skb_headroom(skb) < XDP_PACKET_HEADROOM
> > >       - pskb_expand_head()
> > >         - BUG_ON(skb_shared(skb))
> > >
> > > So I added a draft patch for pktgen, not sure if it has any influence.  
> > 
> > Hmm, as Jesper said pktgen was really not intended to be used this way,
> > so I guess that's why. I guess I'll let him comment on whether he thinks
> > it's worth fixing; or you could send this as a proper patch and see if
> > anyone complains about it ;)
> 
> Don't use pktgen in this way with veth.  If anything pktgen should
> detect that you use pktgen in virtual interfaces and reject/disallow
> that you do this.

OK, got it.

Thanks
Hangbin


* [PATCHv3 0/9] bpf: Add d_path helper
From: Jiri Olsa @ 2020-06-16 10:05 UTC
  To: Alexei Starovoitov, Daniel Borkmann
  Cc: netdev, bpf, Song Liu, Yonghong Song, Martin KaFai Lau,
	David Miller, John Fastabend, Wenbo Zhang, KP Singh,
	Andrii Nakryiko, Brendan Gregg, Florent Revest, Al Viro

hi,
this series adds a d_path helper that returns the full path for a
'path' object.

I originally added and used a 'file_path' helper, which did the same
but took a 'struct file' object. I then realized that file_path is just
a wrapper for d_path, so we cover more call sites if we add a d_path
helper and allow resolving a BTF object nested within another object.
That way d_path can also be called with a file pointer, like:

  bpf_d_path(&file->f_path, buf, size);

The main motivation is to be able to add a dpath (originally filepath)
function to bpftrace:

  # bpftrace -e 'kfunc:vfs_open { printf("%s\n", dpath(args->path)); }'
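
To illustrate the wrapper relationship described above, here is a toy
user-space sketch. All names (toy_path, toy_file, toy_d_path,
toy_file_path) are hypothetical stand-ins and the return convention is
simplified; this is not the kernel or BPF helper API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for the kernel's struct path and struct file. */
struct toy_path {
	const char *full;	/* pretend this is the resolved path */
};

struct toy_file {
	int flags;
	struct toy_path f_path;	/* embedded, as f_path is in struct file */
};

/*
 * Toy d_path(): copy the path into buf. Returns the string length,
 * or -1 if buf is too small.
 */
static int toy_d_path(const struct toy_path *path, char *buf, size_t size)
{
	size_t len = strlen(path->full);

	assert(buf != NULL);
	if (len + 1 > size)
		return -1;
	memcpy(buf, path->full, len + 1);
	return (int)len;
}

/* The file variant is only a thin wrapper around the path variant. */
static int toy_file_path(const struct toy_file *file, char *buf, size_t size)
{
	return toy_d_path(&file->f_path, buf, size);
}
```

Exposing only the path variant is enough as long as the caller may pass
the address of the embedded member, which is what the nested BTF object
support in this series is about.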

v3 changes:
  - changed tests to use skeleton and vmlinux.h [Andrii]
  - refactored to define ID lists in C object [Andrii]
  - changed btf_struct_access for nested ID check,
    instead of adding new function for that [Andrii]
  - fail build with CONFIG_DEBUG_INFO_BTF if libelf is not detected [Andrii]

Also available at:
  https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  bpf/d_path

thanks,
jirka


---
Jiri Olsa (11):
      bpf: Add btfid tool to resolve BTF IDs in ELF object
      bpf: Compile btfid tool at kernel compilation start
      bpf: Add btf_ids object
      bpf: Resolve BTF IDs in vmlinux image
      bpf: Remove btf_id helpers resolving
      bpf: Do not pass enum bpf_access_type to btf_struct_access
      bpf: Allow nested BTF object to be referenced by BTF object + offset
      bpf: Add BTF whitelist support
      bpf: Add d_path helper
      selftests/bpf: Add verifier test for d_path helper
      selftests/bpf: Add test for d_path helper

 Makefile                                        |  25 ++++-
 include/asm-generic/vmlinux.lds.h               |   4 +
 include/linux/bpf.h                             |  16 ++-
 include/uapi/linux/bpf.h                        |  14 ++-
 kernel/bpf/Makefile                             |   2 +-
 kernel/bpf/btf.c                                | 149 +++++++++++--------------
 kernel/bpf/btf_ids.c                            |  26 +++++
 kernel/bpf/btf_ids.h                            | 108 ++++++++++++++++++
 kernel/bpf/verifier.c                           |  39 +++++--
 kernel/trace/bpf_trace.c                        |  40 ++++++-
 net/core/filter.c                               |   2 -
 net/ipv4/bpf_tcp_ca.c                           |   2 +-
 scripts/bpf_helpers_doc.py                      |   2 +
 scripts/link-vmlinux.sh                         |   6 +
 tools/Makefile                                  |   3 +
 tools/bpf/Makefile                              |   5 +-
 tools/bpf/btfid/Build                           |  26 +++++
 tools/bpf/btfid/Makefile                        |  71 ++++++++++++
 tools/bpf/btfid/btfid.c                         | 627 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h                  |  14 ++-
 tools/testing/selftests/bpf/prog_tests/d_path.c | 153 +++++++++++++++++++++++++
 tools/testing/selftests/bpf/progs/test_d_path.c |  55 +++++++++
 tools/testing/selftests/bpf/test_verifier.c     |  13 ++-
 tools/testing/selftests/bpf/verifier/d_path.c   |  38 +++++++
 24 files changed, 1329 insertions(+), 111 deletions(-)
 create mode 100644 kernel/bpf/btf_ids.c
 create mode 100644 kernel/bpf/btf_ids.h
 create mode 100644 tools/bpf/btfid/Build
 create mode 100644 tools/bpf/btfid/Makefile
 create mode 100644 tools/bpf/btfid/btfid.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/d_path.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_d_path.c
 create mode 100644 tools/testing/selftests/bpf/verifier/d_path.c



* [PATCH 01/11] bpf: Add btfid tool to resolve BTF IDs in ELF object
From: Jiri Olsa @ 2020-06-16 10:05 UTC
  To: Alexei Starovoitov, Daniel Borkmann
  Cc: netdev, bpf, Song Liu, Yonghong Song, Martin KaFai Lau,
	David Miller, John Fastabend, Wenbo Zhang, KP Singh,
	Andrii Nakryiko, Brendan Gregg, Florent Revest, Al Viro
In-Reply-To: <20200616100512.2168860-1-jolsa@kernel.org>

The btfid tool scans an ELF object for the .BTF_ids section and
resolves its symbols with BTF IDs.

It will be used at link time to resolve arrays of BTF IDs used in
the verifier, so these IDs do not need to be resolved at runtime.

The expected layout of the .BTF_ids section is described in the
btfid.c header. Related kernel changes are coming in the following
patches.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
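As a reading aid for the symbol name syntax documented in the btfid.c
header, here is a small stand-alone sketch that splits such a name into
its type and symbol parts. split_btf_id() and PREFIX are names made up
for this note; the tool itself parses names differently (see get_id()
in the diff below), and the sketch only covers names that carry the
trailing unique id.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PREFIX "__BTF_ID__"

/*
 * Split "__BTF_ID__<type>__<symbol>__<id>" into type and symbol.
 * Returns 0 on success, -1 if the name does not match the syntax or
 * an output buffer is too small. The trailing "__<id>" is required
 * here, as it is for the func and struct types.
 */
static int split_btf_id(const char *sym, char *type, size_t type_sz,
			char *name, size_t name_sz)
{
	const char *p, *sep, *end;
	size_t len;

	assert(sym && type && name);

	if (strncmp(sym, PREFIX, sizeof(PREFIX) - 1))
		return -1;
	p = sym + sizeof(PREFIX) - 1;

	/* <type> runs up to the first "__". */
	sep = strstr(p, "__");
	if (!sep || sep == p)
		return -1;
	len = sep - p;
	if (len >= type_sz)
		return -1;
	memcpy(type, p, len);
	type[len] = '\0';
	p = sep + 2;

	/* <symbol> runs up to the last "__", which starts the unique id. */
	end = strrchr(p, '_');
	if (!end || end - p < 2 || end[-1] != '_')
		return -1;
	len = (end - 1) - p;
	if (len >= name_sz)
		return -1;
	memcpy(name, p, len);
	name[len] = '\0';
	return 0;
}
```

Searching for the unique id from the end of the string keeps symbols
that themselves contain double underscores intact.
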
 tools/bpf/btfid/Build    |  26 ++
 tools/bpf/btfid/Makefile |  71 +++++
 tools/bpf/btfid/btfid.c  | 627 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 724 insertions(+)
 create mode 100644 tools/bpf/btfid/Build
 create mode 100644 tools/bpf/btfid/Makefile
 create mode 100644 tools/bpf/btfid/btfid.c

diff --git a/tools/bpf/btfid/Build b/tools/bpf/btfid/Build
new file mode 100644
index 000000000000..12d43396d2a0
--- /dev/null
+++ b/tools/bpf/btfid/Build
@@ -0,0 +1,26 @@
+btfid-y += btfid.o
+btfid-y += rbtree.o
+btfid-y += zalloc.o
+btfid-y += string.o
+btfid-y += ctype.o
+btfid-y += str_error_r.o
+
+$(OUTPUT)rbtree.o: ../../lib/rbtree.c FORCE
+	$(call rule_mkdir)
+	$(call if_changed_dep,cc_o_c)
+
+$(OUTPUT)zalloc.o: ../../lib/zalloc.c FORCE
+	$(call rule_mkdir)
+	$(call if_changed_dep,cc_o_c)
+
+$(OUTPUT)string.o: ../../lib/string.c FORCE
+	$(call rule_mkdir)
+	$(call if_changed_dep,cc_o_c)
+
+$(OUTPUT)ctype.o: ../../lib/ctype.c FORCE
+	$(call rule_mkdir)
+	$(call if_changed_dep,cc_o_c)
+
+$(OUTPUT)str_error_r.o: ../../lib/str_error_r.c FORCE
+	$(call rule_mkdir)
+	$(call if_changed_dep,cc_o_c)
diff --git a/tools/bpf/btfid/Makefile b/tools/bpf/btfid/Makefile
new file mode 100644
index 000000000000..30b721cf0a21
--- /dev/null
+++ b/tools/bpf/btfid/Makefile
@@ -0,0 +1,71 @@
+# SPDX-License-Identifier: GPL-2.0-only
+include ../../scripts/Makefile.include
+
+MAKEFLAGS=--no-print-directory
+
+ifeq ($(srctree),)
+srctree := $(patsubst %/,%,$(dir $(CURDIR)))
+srctree := $(patsubst %/,%,$(dir $(srctree)))
+srctree := $(patsubst %/,%,$(dir $(srctree)))
+endif
+
+ifeq ($(V),1)
+  Q =
+else
+  Q = @
+endif
+
+BPF_DIR       = $(srctree)/tools/lib/bpf/
+SUBCMD_DIR    = $(srctree)/tools/lib/subcmd/
+SUBCMD_OUTPUT = $(if $(OUTPUT),$(OUTPUT),$(CURDIR)/)
+
+ifneq ($(OUTPUT),)
+  LIBBPF_PATH = $(OUTPUT)/libbpf/
+  SUBCMD_PATH = $(OUTPUT)/subcmd/
+else
+  LIBBPF_PATH = $(BPF_DIR)
+  SUBCMD_PATH = $(SUBCMD_DIR)
+endif
+
+LIBSUBCMD = $(SUBCMD_OUTPUT)libsubcmd.a
+LIBBPF    = $(LIBBPF_PATH)libbpf.a
+BPFWL     = $(OUTPUT)btfid
+BPFWL_IN  = $(BPFWL)-in.o
+
+all: $(OUTPUT)btfid
+
+$(LIBSUBCMD): fixdep FORCE
+	$(Q)$(MAKE) -C $(SUBCMD_DIR) OUTPUT=$(SUBCMD_OUTPUT)
+
+$(LIBSUBCMD)-clean:
+	$(Q)$(MAKE) -C $(SUBCMD_DIR) O=$(OUTPUT) clean
+
+$(LIBBPF): FORCE
+	$(if $(LIBBPF_PATH),@mkdir -p $(LIBBPF_PATH))
+	$(Q)$(MAKE) -C $(BPF_DIR) OUTPUT=$(LIBBPF_PATH) $(LIBBPF_PATH)libbpf.a
+
+$(LIBBPF)-clean:
+	$(call QUIET_CLEAN, libbpf)
+	$(Q)$(MAKE) -C $(BPF_DIR) OUTPUT=$(LIBBPF_PATH) clean >/dev/null
+
+CFLAGS := -g -I$(srctree)/tools/include -I$(BPF_DIR) -I$(SUBCMD_DIR)
+
+LIBS = -lelf -lz
+
+export srctree OUTPUT CFLAGS
+include $(srctree)/tools/build/Makefile.include
+
+$(BPFWL_IN): fixdep FORCE
+	$(Q)$(MAKE) $(build)=btfid
+
+$(BPFWL): $(LIBBPF) $(LIBSUBCMD) $(BPFWL_IN)
+	$(QUIET_LINK)$(CC) $(BPFWL_IN) $(LDFLAGS) -o $@ $(LIBBPF) $(LIBSUBCMD) $(LIBS)
+
+clean: $(LIBBPF)-clean $(LIBSUBCMD)-clean
+	$(call QUIET_CLEAN, btfid)
+	$(Q)$(RM) -f $(BPFWL)
+	$(Q)find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
+
+FORCE:
+
+.PHONY: all FORCE clean
diff --git a/tools/bpf/btfid/btfid.c b/tools/bpf/btfid/btfid.c
new file mode 100644
index 000000000000..7cdf39bfb150
--- /dev/null
+++ b/tools/bpf/btfid/btfid.c
@@ -0,0 +1,627 @@
+// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+#define  _GNU_SOURCE
+
+/*
+ * btfid scans an ELF object for the .BTF_ids section and resolves
+ * its symbols with BTF IDs.
+ *
+ * Each symbol points to 4 bytes of data and is expected to have
+ * the following name syntax:
+ *
+ * __BTF_ID__<type>__<symbol>[__<id>]
+ *
+ * type is:
+ *
+ *   func   - lookup BTF_KIND_FUNC symbol with <symbol> name
+ *            and put its ID into its data
+ *
+ *             __BTF_ID__func__vfs_close__1:
+ *             .zero 4
+ *
+ *   struct - lookup BTF_KIND_STRUCT symbol with <symbol> name
+ *            and put its ID into its data
+ *
+ *             __BTF_ID__struct__sk_buff__1:
+ *             .zero 4
+ *
+ *   sort   - put symbol size into data area and sort following
+ *            ID list
+ *
+ *             __BTF_ID__sort__list:
+ *             list_cnt:
+ *             .zero 4
+ *             list:
+ *             __BTF_ID__func__vfs_getattr__3:
+ *             .zero 4
+ *             __BTF_ID__func__vfs_fallocate__4:
+ *             .zero 4
+ */
+
+#include <stdio.h>
+#include <string.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <libelf.h>
+#include <gelf.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <errno.h>
+#include <linux/rbtree.h>
+#include <linux/zalloc.h>
+#include <btf.h>
+#include <libbpf.h>
+#include <parse-options.h>
+
+#define ADDR_CNT	100
+#define SECTION		".BTF_ids"
+#define BTF_STRUCT	"struct"
+#define BTF_FUNC	"func"
+#define BTF_SORT	"sort"
+#define BTF_ID		"__BTF_ID__"
+
+struct btf_id {
+	struct rb_node	 rb_node;
+	char		*name;
+	union {
+		int	 id;
+		int	 cnt;
+	};
+	int		 addr_cnt;
+	Elf64_Addr	 addr[ADDR_CNT];
+};
+
+struct object {
+	const char *path;
+
+	struct {
+		int		 fd;
+		Elf		*elf;
+		Elf_Data	*symbols;
+		Elf_Data	*idlist;
+		int		 symbols_shndx;
+		int		 idlist_shndx;
+		size_t		 strtabidx;
+		unsigned long	 idlist_addr;
+	} efile;
+
+	struct rb_root	sorts;
+	struct rb_root	funcs;
+	struct rb_root	structs;
+
+	int nr_funcs;
+	int nr_structs;
+};
+
+static int verbose;
+
+int eprintf(int level, int var, const char *fmt, ...)
+{
+	va_list args;
+	int ret = 0;
+
+	if (var >= level) {
+		va_start(args, fmt);
+		ret = vfprintf(stderr, fmt, args);
+		va_end(args);
+	}
+	return ret;
+}
+
+#ifndef pr_fmt
+#define pr_fmt(fmt) fmt
+#endif
+
+#define pr_debug(fmt, ...) \
+	eprintf(1, verbose, pr_fmt(fmt), ##__VA_ARGS__)
+#define pr_debugN(n, fmt, ...) \
+	eprintf(n, verbose, pr_fmt(fmt), ##__VA_ARGS__)
+#define pr_debug2(fmt, ...) pr_debugN(2, pr_fmt(fmt), ##__VA_ARGS__)
+#define pr_err(fmt, ...) \
+	eprintf(0, verbose, pr_fmt(fmt), ##__VA_ARGS__)
+
+static bool is_btf_id(const char *name)
+{
+	return name && !strncmp(name, BTF_ID, sizeof(BTF_ID) - 1);
+}
+
+static struct btf_id *btf_id__find(struct rb_root *root, const char *name)
+{
+	struct rb_node *p = root->rb_node;
+	struct btf_id *id;
+	int cmp;
+
+	while (p) {
+		id = rb_entry(p, struct btf_id, rb_node);
+		cmp = strcmp(id->name, name);
+		if (cmp < 0)
+			p = p->rb_left;
+		else if (cmp > 0)
+			p = p->rb_right;
+		else
+			return id;
+	}
+	return NULL;
+}
+
+static struct btf_id*
+btf_id__add(struct rb_root *root, char *name, bool unique)
+{
+	struct rb_node **p = &root->rb_node;
+	struct rb_node *parent = NULL;
+	struct btf_id *id;
+	int cmp;
+
+	while (*p != NULL) {
+		parent = *p;
+		id = rb_entry(parent, struct btf_id, rb_node);
+		cmp = strcmp(id->name, name);
+		if (cmp < 0)
+			p = &(*p)->rb_left;
+		else if (cmp > 0)
+			p = &(*p)->rb_right;
+		else
+			return unique ? NULL : id;
+	}
+
+	id = zalloc(sizeof(*id));
+	if (id) {
+		pr_debug("adding symbol %s\n", name);
+		id->name = name;
+		rb_link_node(&id->rb_node, parent, p);
+		rb_insert_color(&id->rb_node, root);
+	}
+	return id;
+}
+
+static char *get_id(const char *prefix_end)
+{
+	/*
+	 * __BTF_ID__func__vfs_truncate__0
+	 * prefix_end =  ^
+	 */
+	char *p, *id = strdup(prefix_end + sizeof("__") - 1);
+
+	if (id) {
+		/*
+		 * __BTF_ID__func__vfs_truncate__0
+		 * id =            ^
+		 *
+		 * cut the unique id part
+		 */
+		p = strrchr(id, '_');
+		p = (p && p > id) ? p - 1 : NULL;
+		if (!p || *p != '_') {
+			free(id);
+			return NULL;
+		}
+		*p = '\0';
+	}
+	return id;
+}
+
+static struct btf_id *add_sort(struct object *obj, char *name)
+{
+	char *id;
+
+	id = strdup(name + sizeof(BTF_SORT) + sizeof("__") - 2);
+	if (!id) {
+		pr_err("FAILED to parse cnt name: %s\n", name);
+		return NULL;
+	}
+
+	return btf_id__add(&obj->sorts, id, true);
+}
+
+static struct btf_id *add_func(struct object *obj, char *name)
+{
+	char *id;
+
+	id = get_id(name + sizeof(BTF_FUNC) - 1);
+	if (!id) {
+		pr_err("FAILED to parse func name: %s\n", name);
+		return NULL;
+	}
+
+	obj->nr_funcs++;
+	return btf_id__add(&obj->funcs, id, false);
+}
+
+static struct btf_id *add_struct(struct object *obj, char *name)
+{
+	char *id;
+
+	id = get_id(name + sizeof(BTF_STRUCT) - 1);
+	if (!id) {
+		pr_err("FAILED to parse struct name: %s\n", name);
+		return NULL;
+	}
+
+	obj->nr_structs++;
+	return btf_id__add(&obj->structs, id, false);
+}
+
+static int elf_collect(struct object *obj)
+{
+	Elf_Scn *scn = NULL;
+	size_t shdrstrndx;
+	int idx = 0;
+	Elf *elf;
+	int fd;
+
+	fd = open(obj->path, O_RDWR, 0666);
+	if (fd == -1) {
+		pr_err("FAILED cannot open %s: %s\n",
+			obj->path, strerror(errno));
+		return -1;
+	}
+
+	elf_version(EV_CURRENT);
+
+	elf = elf_begin(fd, ELF_C_RDWR_MMAP, NULL);
+	if (!elf) {
+		pr_err("FAILED cannot create ELF descriptor: %s\n",
+			elf_errmsg(-1));
+		return -1;
+	}
+
+	obj->efile.fd  = fd;
+	obj->efile.elf = elf;
+
+	elf_flagelf(elf, ELF_C_SET, ELF_F_LAYOUT);
+
+	if (elf_getshdrstrndx(elf, &shdrstrndx) != 0) {
+		pr_err("FAILED cannot get shdr str ndx\n");
+		return -1;
+	}
+
+	/*
+	 * Scan all the ELF sections and save data from the
+	 * .BTF_ids section and the symbol table.
+	 */
+	while ((scn = elf_nextscn(elf, scn)) != NULL) {
+		Elf_Data *data;
+		GElf_Shdr sh;
+		char *name;
+
+		idx++;
+		if (gelf_getshdr(scn, &sh) != &sh) {
+			pr_err("FAILED get section(%d) header\n", idx);
+			return -1;
+		}
+
+		name = elf_strptr(elf, shdrstrndx, sh.sh_name);
+		if (!name) {
+			pr_err("FAILED get section(%d) name\n", idx);
+			return -1;
+		}
+
+		data = elf_getdata(scn, 0);
+		if (!data) {
+			pr_err("failed to get section(%d) data from %s\n",
+				idx, name);
+			return -1;
+		}
+
+		pr_debug2("section(%d) %s, size %ld, link %d, flags %lx, type=%d\n",
+			  idx, name, (unsigned long) data->d_size,
+			  (int) sh.sh_link, (unsigned long) sh.sh_flags,
+			  (int) sh.sh_type);
+
+		if (sh.sh_type == SHT_SYMTAB) {
+			obj->efile.symbols       = data;
+			obj->efile.symbols_shndx = idx;
+			obj->efile.strtabidx     = sh.sh_link;
+		} else if (!strcmp(name, SECTION)) {
+			obj->efile.idlist       = data;
+			obj->efile.idlist_shndx = idx;
+			obj->efile.idlist_addr  = sh.sh_addr;
+		}
+	}
+
+	/*
+	 * We did not find .BTF_ids section or
+	 * symbols section, nothing to do..
+	 */
+	if (obj->efile.idlist_shndx == -1 ||
+	    obj->efile.symbols_shndx == -1) {
+		pr_err("FAILED to find needed sections\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int symbols_collect(struct object *obj)
+{
+	Elf_Scn *scn = NULL;
+	int n, i, err = 0;
+	GElf_Shdr sh;
+	char *name;
+
+	scn = elf_getscn(obj->efile.elf, obj->efile.symbols_shndx);
+	if (!scn)
+		return -1;
+
+	if (gelf_getshdr(scn, &sh) != &sh)
+		return -1;
+
+	n = sh.sh_size / sh.sh_entsize;
+
+	/*
+	 * Scan symbols and look for the ones starting with
+	 * __BTF_ID__* that live in the .BTF_ids section.
+	 */
+	for (i = 0; !err && i < n; i++) {
+		char *tmp, *prefix;
+		struct btf_id *id;
+		GElf_Sym sym;
+		int err = -1;
+
+		if (!gelf_getsym(obj->efile.symbols, i, &sym))
+			return -1;
+
+		if (sym.st_shndx != obj->efile.idlist_shndx)
+			continue;
+
+		name = elf_strptr(obj->efile.elf, obj->efile.strtabidx,
+				  sym.st_name);
+
+		if (!is_btf_id(name))
+			continue;
+
+		/*
+		 * __BTF_ID__TYPE__vfs_truncate__0
+		 * prefix =  ^
+		 */
+		prefix = name + sizeof(BTF_ID) - 1;
+
+		if (!strncmp(prefix, BTF_STRUCT, sizeof(BTF_STRUCT) - 1)) {
+			id = add_struct(obj, prefix);
+		} else if (!strncmp(prefix, BTF_FUNC, sizeof(BTF_FUNC) - 1)) {
+			id = add_func(obj, prefix);
+		} else if (!strncmp(prefix, BTF_SORT, sizeof(BTF_SORT) - 1)) {
+			id = add_sort(obj, prefix);
+
+			/*
+			 * SORT objects store list's count, which is encoded
+			 * in symbol's size.
+			 */
+			if (id)
+				id->cnt = sym.st_size / sizeof(int);
+		} else {
+			pr_err("FAILED unsupported prefix %s\n", prefix);
+			return -1;
+		}
+
+		if (!id)
+			return -ENOMEM;
+
+		if (id->addr_cnt >= ADDR_CNT) {
+			pr_err("FAILED symbol %s crossed the number of allowed lists\n",
+				id->name);
+			return -1;
+		}
+		id->addr[id->addr_cnt++] = sym.st_value;
+	}
+
+	return 0;
+}
+
+static int symbols_resolve(struct object *obj)
+{
+	int nr_structs = obj->nr_structs;
+	int nr_funcs   = obj->nr_funcs;
+	struct btf *btf;
+	int err, type_id;
+	__u32 nr;
+
+	btf = btf__parse_elf(obj->path, NULL);
+	err = libbpf_get_error(btf);
+	if (err) {
+		pr_err("FAILED: load BTF from %s: %s\n",
+			obj->path, strerror(-err));
+		return -1;
+	}
+
+	nr = btf__get_nr_types(btf);
+
+	/*
+	 * Iterate all the BTF types and search for collected symbol IDs.
+	 */
+	for (type_id = 1; type_id <= nr; type_id++) {
+		const struct btf_type *type;
+		struct rb_root *root = NULL;
+		struct btf_id *id;
+		const char *str;
+		int *nr;
+
+		type = btf__type_by_id(btf, type_id);
+		if (!type)
+			continue;
+
+		/* We support func/struct types. */
+		if (BTF_INFO_KIND(type->info) == BTF_KIND_FUNC && nr_funcs) {
+			root = &obj->funcs;
+			nr = &nr_funcs;
+		} else if (BTF_INFO_KIND(type->info) == BTF_KIND_STRUCT && nr_structs) {
+			root = &obj->structs;
+			nr = &nr_structs;
+		} else {
+			continue;
+		}
+
+		str = btf__name_by_offset(btf, type->name_off);
+		if (!str)
+			continue;
+
+		id = btf_id__find(root, str);
+		if (id) {
+			id->id = type_id;
+			(*nr)--;
+		}
+	}
+
+	return 0;
+}
+
+static int id_patch(struct object *obj, struct btf_id *id)
+{
+	Elf_Data *data = obj->efile.idlist;
+	int *ptr = data->d_buf;
+	int i;
+
+	if (!id->id) {
+		pr_err("FAILED unresolved symbol %s\n", id->name);
+		return -EINVAL;
+	}
+
+	for (i = 0; i < id->addr_cnt; i++) {
+		unsigned long addr = id->addr[i];
+		unsigned long idx = addr - obj->efile.idlist_addr;
+
+		pr_debug("patching addr %5lu: ID %7d [%s]\n", idx, id->id, id->name);
+
+		if (idx >= data->d_size) {
+			pr_err("FAILED patching index %lu out of bounds %lu\n",
+				idx, data->d_size);
+			return -1;
+		}
+
+		idx = idx / sizeof(int);
+		ptr[idx] = id->id;
+	}
+
+	return 0;
+}
+
+static int __symbols_patch(struct object *obj, struct rb_root *root)
+{
+	struct rb_node *next;
+	struct btf_id *id;
+
+	next = rb_first(root);
+	while (next) {
+		id = rb_entry(next, struct btf_id, rb_node);
+
+		if (id_patch(obj, id))
+			return -1;
+
+		next = rb_next(next);
+	}
+	return 0;
+}
+
+static int cmp_id(const void *pa, const void *pb)
+{
+	const int *a = pa, *b = pb;
+
+	return *a - *b;
+}
+
+static int sorts_patch(struct object *obj)
+{
+	Elf_Data *data = obj->efile.idlist;
+	int *ptr = data->d_buf;
+	struct rb_node *next;
+	struct btf_id *id;
+
+	next = rb_first(&obj->sorts);
+	while (next) {
+		unsigned long addr, idx;
+		int *base;
+		int cnt;
+
+		id = rb_entry(next, struct btf_id, rb_node);
+
+		if (id->addr_cnt != 1)
+			return -1;
+
+		addr = id->addr[0];
+		idx = (addr - obj->efile.idlist_addr) / sizeof(int);
+		base = &ptr[idx] + 1;
+		cnt = ptr[idx];
+
+		pr_debug("sorting  addr %5lu: cnt %6d [%s]\n",
+			 (idx + 1) * sizeof(int), cnt, id->name);
+
+		qsort(base, cnt, sizeof(int), cmp_id);
+		next = rb_next(next);
+	}
+	return 0;
+}
+
+static int symbols_patch(struct object *obj)
+{
+	int err;
+
+	if (__symbols_patch(obj, &obj->funcs) ||
+	    __symbols_patch(obj, &obj->structs) ||
+	    __symbols_patch(obj, &obj->sorts))
+		return -1;
+
+	if (sorts_patch(obj))
+		return -1;
+
+	elf_flagdata(obj->efile.idlist, ELF_C_SET, ELF_F_DIRTY);
+
+	err = elf_update(obj->efile.elf, ELF_C_WRITE);
+	if (err < 0) {
+		pr_err("FAILED elf_update(WRITE): %s\n",
+			elf_errmsg(-1));
+	}
+
+	pr_debug("update %s for %s\n",
+		 err >= 0 ? "ok" : "failed", obj->path);
+	return err < 0 ? -1 : 0;
+}
+
+static const char * const btfid_usage[] = {
+	"btfid [<options>] <ELF object>",
+	NULL
+};
+
+static struct option btfid_options[] = {
+	OPT_INCR('v', "verbose", &verbose,
+		 "be more verbose (show errors, etc)"),
+	OPT_END()
+};
+
+int main(int argc, const char **argv)
+{
+	struct object obj = {
+		.efile = {
+			.idlist_shndx  = -1,
+			.symbols_shndx = -1,
+		},
+		.funcs   = RB_ROOT,
+		.structs = RB_ROOT,
+		.sorts   = RB_ROOT,
+	};
+
+	argc = parse_options(argc, argv, btfid_options, btfid_usage,
+			     PARSE_OPT_STOP_AT_NON_OPTION);
+	if (argc != 1)
+		usage_with_options(btfid_usage, btfid_options);
+
+	obj.path = argv[0];
+
+	/*
+	 * We do proper cleanup and file close
+	 * intentionally only on success.
+	 */
+	if (elf_collect(&obj))
+		return -1;
+
+	if (symbols_collect(&obj))
+		return -1;
+
+	if (symbols_resolve(&obj))
+		return -1;
+
+	if (symbols_patch(&obj))
+		return -1;
+
+	elf_end(obj.efile.elf);
+	close(obj.efile.fd);
+	return 0;
+}
-- 
2.25.4



* [PATCH 02/11] bpf: Compile btfid tool at kernel compilation start
From: Jiri Olsa @ 2020-06-16 10:05 UTC
  To: Alexei Starovoitov, Daniel Borkmann
  Cc: netdev, bpf, Song Liu, Yonghong Song, Martin KaFai Lau,
	David Miller, John Fastabend, Wenbo Zhang, KP Singh,
	Andrii Nakryiko, Brendan Gregg, Florent Revest, Al Viro
In-Reply-To: <20200616100512.2168860-1-jolsa@kernel.org>

The btfid tool will be used during vmlinux linking,
so it has to be built before that step.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 Makefile           | 22 ++++++++++++++++++----
 tools/Makefile     |  3 +++
 tools/bpf/Makefile |  5 ++++-
 3 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/Makefile b/Makefile
index 839f9fee22cb..b190d502d7d7 100644
--- a/Makefile
+++ b/Makefile
@@ -1066,9 +1066,10 @@ export mod_sign_cmd
 
 HOST_LIBELF_LIBS = $(shell pkg-config libelf --libs 2>/dev/null || echo -lelf)
 
+has_libelf = $(call try-run,\
+               echo "int main() {}" | $(HOSTCC) -xc -o /dev/null $(HOST_LIBELF_LIBS) -,1,0)
+
 ifdef CONFIG_STACK_VALIDATION
-  has_libelf := $(call try-run,\
-		echo "int main() {}" | $(HOSTCC) -xc -o /dev/null $(HOST_LIBELF_LIBS) -,1,0)
   ifeq ($(has_libelf),1)
     objtool_target := tools/objtool FORCE
   else
@@ -1077,6 +1078,14 @@ ifdef CONFIG_STACK_VALIDATION
   endif
 endif
 
+ifdef CONFIG_DEBUG_INFO_BTF
+  ifeq ($(has_libelf),1)
+    btfid_target := tools/bpf/btfid FORCE
+  else
+    ERROR_BTF_IDS_RESOLVE := 1
+  endif
+endif
+
 PHONY += prepare0
 
 export MODORDER := $(extmod-prefix)modules.order
@@ -1188,7 +1197,7 @@ prepare0: archprepare
 	$(Q)$(MAKE) $(build)=.
 
 # All the preparing..
-prepare: prepare0 prepare-objtool
+prepare: prepare0 prepare-objtool prepare-btfid
 
 # Support for using generic headers in asm-generic
 asm-generic := -f $(srctree)/scripts/Makefile.asm-generic obj
@@ -1201,7 +1210,7 @@ uapi-asm-generic:
 	$(Q)$(MAKE) $(asm-generic)=arch/$(SRCARCH)/include/generated/uapi/asm \
 	generic=include/uapi/asm-generic
 
-PHONY += prepare-objtool
+PHONY += prepare-objtool prepare-btfid
 prepare-objtool: $(objtool_target)
 ifeq ($(SKIP_STACK_VALIDATION),1)
 ifdef CONFIG_UNWINDER_ORC
@@ -1212,6 +1221,11 @@ else
 endif
 endif
 
+prepare-btfid: $(btfid_target)
+ifeq ($(ERROR_BTF_IDS_RESOLVE),1)
+	@echo "error: Cannot resolve BTF IDs for CONFIG_DEBUG_INFO_BTF, please install libelf-dev, libelf-devel or elfutils-libelf-devel" >&2
+	@false
+endif
 # Generate some files
 # ---------------------------------------------------------------------------
 
diff --git a/tools/Makefile b/tools/Makefile
index bd778812e915..85af6ebbce91 100644
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -67,6 +67,9 @@ cpupower: FORCE
 cgroup firewire hv guest bootconfig spi usb virtio vm bpf iio gpio objtool leds wmi pci firmware debugging: FORCE
 	$(call descend,$@)
 
+bpf/%: FORCE
+	$(call descend,$@)
+
 liblockdep: FORCE
 	$(call descend,lib/lockdep)
 
diff --git a/tools/bpf/Makefile b/tools/bpf/Makefile
index 77472e28c8fd..d8bbe7ef264f 100644
--- a/tools/bpf/Makefile
+++ b/tools/bpf/Makefile
@@ -124,5 +124,8 @@ runqslower_install:
 runqslower_clean:
 	$(call descend,runqslower,clean)
 
+btfid:
+	$(call descend,btfid)
+
 .PHONY: all install clean bpftool bpftool_install bpftool_clean \
-	runqslower runqslower_install runqslower_clean
+	runqslower runqslower_install runqslower_clean btfid
-- 
2.25.4



* [PATCH 03/11] bpf: Add btf_ids object
From: Jiri Olsa @ 2020-06-16 10:05 UTC
  To: Alexei Starovoitov, Daniel Borkmann
  Cc: Andrii Nakryiko, netdev, bpf, Song Liu, Yonghong Song,
	Martin KaFai Lau, David Miller, John Fastabend, Wenbo Zhang,
	KP Singh, Andrii Nakryiko, Brendan Gregg, Florent Revest, Al Viro
In-Reply-To: <20200616100512.2168860-1-jolsa@kernel.org>

Adding support to generate the .BTF_ids section, which will
hold various lists of BTF IDs for the verifier.

Adding macros that help define lists of BTF IDs placed in the
.BTF_ids section. They are initially filled with zeros
(during compilation) and resolved later, during the
linking phase, by the btfid tool.

The following defines a list of one BTF ID that is accessible
within kernel code as the bpf_skb_output_btf_ids array.

  extern int bpf_skb_output_btf_ids[];

  BTF_ID_LIST(bpf_skb_output_btf_ids)
  BTF_ID(struct, sk_buff)

Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
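As a reading aid for the BTF_ID_LIST example in the commit message, here
is a user-space sketch of how a list defined this way might be consumed
once its slots have been filled in. The array, the ID values and the
demo_* functions are all hypothetical; in the kernel the slots are
emitted by the macros and patched by the btfid tool, not by C code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Stand-in for a list emitted by BTF_ID_LIST()/BTF_ID(): one 4-byte
 * slot per entry, all zero until a resolver fills them in.
 */
static int demo_btf_ids[2];

/* Pretend to be the link-time tool: write made-up IDs into the slots. */
static void demo_resolve(void)
{
	demo_btf_ids[0] = 1234;	/* hypothetical ID */
	demo_btf_ids[1] = 5678;	/* hypothetical ID */
}

/* Returns 1 if id is present in the first cnt slots of list. */
static int demo_id_in_list(const int *list, size_t cnt, int id)
{
	size_t i;

	assert(list != NULL);

	/* 0 marks an unresolved slot, so it never counts as a match. */
	if (!id)
		return 0;

	for (i = 0; i < cnt; i++)
		if (list[i] == id)
			return 1;
	return 0;
}
```

Because the list is unsorted, the lookup is a linear scan; that is fine
for the short lists shown here.
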
 include/asm-generic/vmlinux.lds.h |  4 ++
 kernel/bpf/Makefile               |  2 +-
 kernel/bpf/btf_ids.c              |  3 ++
 kernel/bpf/btf_ids.h              | 70 +++++++++++++++++++++++++++++++
 4 files changed, 78 insertions(+), 1 deletion(-)
 create mode 100644 kernel/bpf/btf_ids.c
 create mode 100644 kernel/bpf/btf_ids.h

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index db600ef218d7..0be2ee265931 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -641,6 +641,10 @@
 		__start_BTF = .;					\
 		*(.BTF)							\
 		__stop_BTF = .;						\
+	}								\
+	. = ALIGN(4);							\
+	.BTF_ids : AT(ADDR(.BTF_ids) - LOAD_OFFSET) {			\
+		*(.BTF_ids)						\
 	}
 #else
 #define BTF
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index 1131a921e1a6..21e4fc7c25ab 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -7,7 +7,7 @@ obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list
 obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o ringbuf.o
 obj-$(CONFIG_BPF_SYSCALL) += disasm.o
 obj-$(CONFIG_BPF_JIT) += trampoline.o
-obj-$(CONFIG_BPF_SYSCALL) += btf.o
+obj-$(CONFIG_BPF_SYSCALL) += btf.o btf_ids.o
 obj-$(CONFIG_BPF_JIT) += dispatcher.o
 ifeq ($(CONFIG_NET),y)
 obj-$(CONFIG_BPF_SYSCALL) += devmap.o
diff --git a/kernel/bpf/btf_ids.c b/kernel/bpf/btf_ids.c
new file mode 100644
index 000000000000..e7f9d94ad293
--- /dev/null
+++ b/kernel/bpf/btf_ids.c
@@ -0,0 +1,3 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include "btf_ids.h"
diff --git a/kernel/bpf/btf_ids.h b/kernel/bpf/btf_ids.h
new file mode 100644
index 000000000000..68aa5c38a37f
--- /dev/null
+++ b/kernel/bpf/btf_ids.h
@@ -0,0 +1,70 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef __BTF_IDS_H__
+#define __BTF_IDS_H__
+
+#include <linux/stringify.h>
+#include <linux/compiler.h>
+#include <linux/linkage.h>
+
+
+/*
+ * Following macros help to define lists of BTF IDs placed
+ * in .BTF_ids section. They are initially filled with zeros
+ * (during compilation) and resolved later during the
+ * linking phase by btfid tool.
+ *
+ * Any change in list layout must be reflected in btfid
+ * tool logic.
+ */
+
+#define SECTION ".BTF_ids"
+
+#define ____BTF_ID(symbol)				\
+asm(							\
+".pushsection " SECTION ",\"a\";               \n"	\
+".local " #symbol " ;                          \n"	\
+".type  " #symbol ", @object;                  \n"	\
+".size  " #symbol ", 4;                        \n"	\
+#symbol ":                                     \n"	\
+".zero 4                                       \n"	\
+".popsection;                                  \n");
+
+#define __BTF_ID(...) \
+	____BTF_ID(__VA_ARGS__)
+
+#define __ID(prefix) \
+	__PASTE(prefix, __COUNTER__)
+
+
+/*
+ * The BTF_ID defines unique symbol for each ID pointing
+ * to 4 zero bytes.
+ */
+#define BTF_ID(prefix, name) \
+	__BTF_ID(__ID(__BTF_ID__##prefix##__##name##__))
+
+
+/*
+ * The BTF_ID_LIST macro defines pure (unsorted) list
+ * of BTF IDs, with following layout:
+ *
+ * BTF_ID_LIST(list1)
+ * BTF_ID(type1, name1)
+ * BTF_ID(type2, name2)
+ *
+ * list1:
+ * __BTF_ID__type1__name1__1:
+ * .zero 4
+ * __BTF_ID__type2__name2__2:
+ * .zero 4
+ *
+ */
+#define BTF_ID_LIST(name)				\
+asm(							\
+".pushsection " SECTION ",\"a\";               \n"	\
+".global " #name ";                            \n"	\
+#name ":;                                      \n"	\
+".popsection;                                  \n");
+
+#endif
-- 
2.25.4



* [PATCH 04/11] bpf: Resolve BTF IDs in vmlinux image
From: Jiri Olsa @ 2020-06-16 10:05 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann
  Cc: netdev, bpf, Song Liu, Yonghong Song, Martin KaFai Lau,
	David Miller, John Fastabend, Wenbo Zhang, KP Singh,
	Andrii Nakryiko, Brendan Gregg, Florent Revest, Al Viro
In-Reply-To: <20200616100512.2168860-1-jolsa@kernel.org>

Run btfid on vmlinux object during linking, so the
.BTF_ids section is processed and IDs are resolved.
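
To make "resolved" concrete, here is a rough userspace model of the
step (a sketch under assumptions, not the btfid implementation): every
slot starts as zero and is overwritten with the ID of the BTF type
whose kind and name match. The toy_* types and the ID values used with
them are invented for the example:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one 4-byte slot in .BTF_ids, together with the
 * kind and name recovered from its symbol. id stays 0 until resolved. */
struct toy_id_slot {
	const char *kind;
	const char *name;
	int id;
};

/* Hypothetical model of one entry in the kernel's BTF type table. */
struct toy_btf_type {
	const char *kind;
	const char *name;
	int id;
};

/* Fill in every slot that has a matching type.
 * Returns the number of slots left unresolved. */
static size_t toy_resolve_ids(struct toy_id_slot *slots, size_t nr_slots,
			      const struct toy_btf_type *types,
			      size_t nr_types)
{
	size_t i, j, unresolved = 0;

	for (i = 0; i < nr_slots; i++) {
		for (j = 0; j < nr_types; j++) {
			if (!strcmp(slots[i].kind, types[j].kind) &&
			    !strcmp(slots[i].name, types[j].name)) {
				slots[i].id = types[j].id;
				break;
			}
		}
		if (j == nr_types)
			unresolved++;
	}
	return unresolved;
}
```

A slot that keeps the value 0 after this pass is one for which no type
was found, which is why 0 can serve as the "unresolved" marker here.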

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 Makefile                 |  3 ++-
 include/linux/bpf.h      |  5 +++++
 kernel/bpf/btf_ids.c     | 12 ++++++++++++
 kernel/trace/bpf_trace.c |  2 --
 net/core/filter.c        |  2 --
 scripts/link-vmlinux.sh  |  6 ++++++
 6 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/Makefile b/Makefile
index b190d502d7d7..889d909fd71a 100644
--- a/Makefile
+++ b/Makefile
@@ -448,6 +448,7 @@ OBJSIZE		= $(CROSS_COMPILE)size
 STRIP		= $(CROSS_COMPILE)strip
 endif
 PAHOLE		= pahole
+BTFID		= $(srctree)/tools/bpf/btfid/btfid
 LEX		= flex
 YACC		= bison
 AWK		= awk
@@ -524,7 +525,7 @@ GCC_PLUGINS_CFLAGS :=
 CLANG_FLAGS :=
 
 export ARCH SRCARCH CONFIG_SHELL BASH HOSTCC KBUILD_HOSTCFLAGS CROSS_COMPILE LD CC
-export CPP AR NM STRIP OBJCOPY OBJDUMP OBJSIZE READELF PAHOLE LEX YACC AWK INSTALLKERNEL
+export CPP AR NM STRIP OBJCOPY OBJDUMP OBJSIZE READELF PAHOLE BTFID LEX YACC AWK INSTALLKERNEL
 export PERL PYTHON PYTHON3 CHECK CHECKFLAGS MAKE UTS_MACHINE HOSTCXX
 export _GZIP _BZIP2 _LZOP LZMA LZ4 XZ
 export KBUILD_HOSTCXXFLAGS KBUILD_HOSTLDFLAGS KBUILD_HOSTLDLIBS LDFLAGS_MODULE
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 07052d44bca1..f18c23dcc858 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1743,4 +1743,9 @@ enum bpf_text_poke_type {
 int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
 		       void *addr1, void *addr2);
 
+extern int bpf_skb_output_btf_ids[];
+extern int bpf_seq_printf_btf_ids[];
+extern int bpf_seq_write_btf_ids[];
+extern int bpf_xdp_output_btf_ids[];
+
 #endif /* _LINUX_BPF_H */
diff --git a/kernel/bpf/btf_ids.c b/kernel/bpf/btf_ids.c
index e7f9d94ad293..d8d0df162f04 100644
--- a/kernel/bpf/btf_ids.c
+++ b/kernel/bpf/btf_ids.c
@@ -1,3 +1,15 @@
 // SPDX-License-Identifier: GPL-2.0-only
 
 #include "btf_ids.h"
+
+BTF_ID_LIST(bpf_skb_output_btf_ids)
+BTF_ID(struct, sk_buff)
+
+BTF_ID_LIST(bpf_seq_printf_btf_ids)
+BTF_ID(struct, seq_file)
+
+BTF_ID_LIST(bpf_seq_write_btf_ids)
+BTF_ID(struct, seq_file)
+
+BTF_ID_LIST(bpf_xdp_output_btf_ids)
+BTF_ID(struct, xdp_buff)
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 3744372a24e2..c1866d76041f 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -667,7 +667,6 @@ BPF_CALL_5(bpf_seq_printf, struct seq_file *, m, char *, fmt, u32, fmt_size,
 	return err;
 }
 
-static int bpf_seq_printf_btf_ids[5];
 static const struct bpf_func_proto bpf_seq_printf_proto = {
 	.func		= bpf_seq_printf,
 	.gpl_only	= true,
@@ -685,7 +684,6 @@ BPF_CALL_3(bpf_seq_write, struct seq_file *, m, const void *, data, u32, len)
 	return seq_write(m, data, len) ? -EOVERFLOW : 0;
 }
 
-static int bpf_seq_write_btf_ids[5];
 static const struct bpf_func_proto bpf_seq_write_proto = {
 	.func		= bpf_seq_write,
 	.gpl_only	= true,
diff --git a/net/core/filter.c b/net/core/filter.c
index 209482a4eaa2..440e52061be8 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3775,7 +3775,6 @@ static const struct bpf_func_proto bpf_skb_event_output_proto = {
 	.arg5_type	= ARG_CONST_SIZE_OR_ZERO,
 };
 
-static int bpf_skb_output_btf_ids[5];
 const struct bpf_func_proto bpf_skb_output_proto = {
 	.func		= bpf_skb_event_output,
 	.gpl_only	= true,
@@ -4169,7 +4168,6 @@ static const struct bpf_func_proto bpf_xdp_event_output_proto = {
 	.arg5_type	= ARG_CONST_SIZE_OR_ZERO,
 };
 
-static int bpf_xdp_output_btf_ids[5];
 const struct bpf_func_proto bpf_xdp_output_proto = {
 	.func		= bpf_xdp_event_output,
 	.gpl_only	= true,
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 57cb14bd8925..99a3f8c65e84 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -336,6 +336,12 @@ fi
 
 vmlinux_link vmlinux "${kallsymso}" ${btf_vmlinux_bin_o}
 
+# fill in BTF IDs
+if [ -n "${CONFIG_DEBUG_INFO_BTF}" ]; then
+info BTFID vmlinux
+${BTFID} vmlinux
+fi
+
 if [ -n "${CONFIG_BUILDTIME_TABLE_SORT}" ]; then
 	info SORTTAB vmlinux
 	if ! sorttable vmlinux; then
-- 
2.25.4



* [PATCH] e1000e: continue to init phy even when failed to disable ULP
From: Aaron Ma @ 2020-06-16 10:05 UTC (permalink / raw)
  To: jeffrey.t.kirsher, davem, kuba, intel-wired-lan, netdev,
	linux-kernel, vitaly.lifshits, kai.heng.feng, sasha.neftin

After commit "e1000e: disable s0ix entry and exit flows for ME systems",
some ThinkPads always fail to disable ULP by ME.
Commit "e1000e: Warn if disabling ULP failed" then breaks out of PHY init:

error log:
[   42.364753] e1000e 0000:00:1f.6 enp0s31f6: Failed to disable ULP
[   42.524626] e1000e 0000:00:1f.6 enp0s31f6: PHY Wakeup cause - Unicast Packet
[   42.822476] e1000e 0000:00:1f.6 enp0s31f6: Hardware Error

When s0ix is disabled, E1000_FWSM_ULP_CFG_DONE never becomes 1.
If PHY init continues as it did before, the device works as before.
iperf test results are good too.

Change e_warn to e_dbg so the message does not confuse users.
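
The intended control flow can be sketched in plain C as follows. This
is only an illustration with made-up toy_* stand-ins for the driver
functions, not the driver code itself: a failure to disable ULP is
logged at debug level and initialization carries on.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in: fails when firmware never reports "config done". */
static int toy_disable_ulp(int fw_cfg_done)
{
	return fw_cfg_done ? 0 : -1;
}

/* Hypothetical stand-in for acquiring the PHY; always succeeds here. */
static int toy_acquire_phy(void)
{
	return 0;
}

static int toy_init_phy_workarounds(int fw_cfg_done)
{
	int ret;

	ret = toy_disable_ulp(fw_cfg_done);
	if (ret)
		/* log only, do not bail out of the init path */
		fprintf(stderr, "debug: Failed to disable ULP (%d)\n", ret);

	ret = toy_acquire_phy();
	if (ret)
		return ret;

	/* the rest of the PHY setup would follow here */
	return 0;
}
```

With this shape, toy_init_phy_workarounds() returns 0 whether or not
the ULP step succeeded, which mirrors the behaviour described above.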

Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
---
 drivers/net/ethernet/intel/e1000e/ich8lan.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.c b/drivers/net/ethernet/intel/e1000e/ich8lan.c
index f999cca37a8a..63405819eb83 100644
--- a/drivers/net/ethernet/intel/e1000e/ich8lan.c
+++ b/drivers/net/ethernet/intel/e1000e/ich8lan.c
@@ -302,8 +302,7 @@ static s32 e1000_init_phy_workarounds_pchlan(struct e1000_hw *hw)
 	hw->dev_spec.ich8lan.ulp_state = e1000_ulp_state_unknown;
 	ret_val = e1000_disable_ulp_lpt_lp(hw, true);
 	if (ret_val) {
-		e_warn("Failed to disable ULP\n");
-		goto out;
+		e_dbg("Failed to disable ULP\n");
 	}
 
 	ret_val = hw->phy.ops.acquire(hw);
-- 
2.26.2



* [PATCH 05/11] bpf: Remove btf_id helpers resolving
From: Jiri Olsa @ 2020-06-16 10:05 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann
  Cc: netdev, bpf, Song Liu, Yonghong Song, Martin KaFai Lau,
	David Miller, John Fastabend, Wenbo Zhang, KP Singh,
	Andrii Nakryiko, Brendan Gregg, Florent Revest, Al Viro
In-Reply-To: <20200616100512.2168860-1-jolsa@kernel.org>

Now that the helpers' BTF IDs have moved into the .BTF_ids section,
we can remove the code that resolves those IDs at runtime.
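
What remains is essentially an array lookup. A simplified userspace
sketch of that lookup, with hypothetical toy_* types in place of
struct bpf_func_proto and an extra bounds check that the kernel
function does not have:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAX_ARGS 5

enum toy_arg_type {
	TOY_ARG_ANYTHING,
	TOY_ARG_PTR_TO_BTF_ID,
};

/* Hypothetical, trimmed-down model of a helper prototype. */
struct toy_func_proto {
	enum toy_arg_type arg_type[TOY_MAX_ARGS];
	const int *btf_id;	/* pre-resolved IDs, one per argument */
};

static int toy_resolve_helper_id(const struct toy_func_proto *fn, int arg)
{
	if (arg < 0 || arg >= TOY_MAX_ARGS)
		return -EINVAL;
	if (fn->arg_type[arg] != TOY_ARG_PTR_TO_BTF_ID)
		return -EINVAL;
	/* a helper that takes a BTF ID argument must provide a list */
	if (!fn->btf_id)
		return -EINVAL;

	return fn->btf_id[arg];
}
```

No search, caching or locking is needed any more, because the IDs are
already in place before the kernel runs.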

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 kernel/bpf/btf.c | 88 +++---------------------------------------------
 1 file changed, 4 insertions(+), 84 deletions(-)

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 58c9af1d4808..aea7b2cc8d26 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -4049,96 +4049,16 @@ int btf_struct_access(struct bpf_verifier_log *log,
 	return -EINVAL;
 }
 
-static int __btf_resolve_helper_id(struct bpf_verifier_log *log, void *fn,
-				   int arg)
-{
-	char fnname[KSYM_SYMBOL_LEN + 4] = "btf_";
-	const struct btf_param *args;
-	const struct btf_type *t;
-	const char *tname, *sym;
-	u32 btf_id, i;
-
-	if (IS_ERR(btf_vmlinux)) {
-		bpf_log(log, "btf_vmlinux is malformed\n");
-		return -EINVAL;
-	}
-
-	sym = kallsyms_lookup((long)fn, NULL, NULL, NULL, fnname + 4);
-	if (!sym) {
-		bpf_log(log, "kernel doesn't have kallsyms\n");
-		return -EFAULT;
-	}
-
-	for (i = 1; i <= btf_vmlinux->nr_types; i++) {
-		t = btf_type_by_id(btf_vmlinux, i);
-		if (BTF_INFO_KIND(t->info) != BTF_KIND_TYPEDEF)
-			continue;
-		tname = __btf_name_by_offset(btf_vmlinux, t->name_off);
-		if (!strcmp(tname, fnname))
-			break;
-	}
-	if (i > btf_vmlinux->nr_types) {
-		bpf_log(log, "helper %s type is not found\n", fnname);
-		return -ENOENT;
-	}
-
-	t = btf_type_by_id(btf_vmlinux, t->type);
-	if (!btf_type_is_ptr(t))
-		return -EFAULT;
-	t = btf_type_by_id(btf_vmlinux, t->type);
-	if (!btf_type_is_func_proto(t))
-		return -EFAULT;
-
-	args = (const struct btf_param *)(t + 1);
-	if (arg >= btf_type_vlen(t)) {
-		bpf_log(log, "bpf helper %s doesn't have %d-th argument\n",
-			fnname, arg);
-		return -EINVAL;
-	}
-
-	t = btf_type_by_id(btf_vmlinux, args[arg].type);
-	if (!btf_type_is_ptr(t) || !t->type) {
-		/* anything but the pointer to struct is a helper config bug */
-		bpf_log(log, "ARG_PTR_TO_BTF is misconfigured\n");
-		return -EFAULT;
-	}
-	btf_id = t->type;
-	t = btf_type_by_id(btf_vmlinux, t->type);
-	/* skip modifiers */
-	while (btf_type_is_modifier(t)) {
-		btf_id = t->type;
-		t = btf_type_by_id(btf_vmlinux, t->type);
-	}
-	if (!btf_type_is_struct(t)) {
-		bpf_log(log, "ARG_PTR_TO_BTF is not a struct\n");
-		return -EFAULT;
-	}
-	bpf_log(log, "helper %s arg%d has btf_id %d struct %s\n", fnname + 4,
-		arg, btf_id, __btf_name_by_offset(btf_vmlinux, t->name_off));
-	return btf_id;
-}
-
 int btf_resolve_helper_id(struct bpf_verifier_log *log,
 			  const struct bpf_func_proto *fn, int arg)
 {
-	int *btf_id = &fn->btf_id[arg];
-	int ret;
-
 	if (fn->arg_type[arg] != ARG_PTR_TO_BTF_ID)
 		return -EINVAL;
 
-	ret = READ_ONCE(*btf_id);
-	if (ret)
-		return ret;
-	/* ok to race the search. The result is the same */
-	ret = __btf_resolve_helper_id(log, fn->func, arg);
-	if (!ret) {
-		/* Function argument cannot be type 'void' */
-		bpf_log(log, "BTF resolution bug\n");
-		return -EFAULT;
-	}
-	WRITE_ONCE(*btf_id, ret);
-	return ret;
+	if (WARN_ON_ONCE(!fn->btf_id))
+		return -EINVAL;
+
+	return fn->btf_id[arg];
 }
 
 static int __get_type_size(struct btf *btf, u32 btf_id,
-- 
2.25.4



* [PATCH 07/11] bpf: Allow nested BTF object to be referenced by BTF object + offset
From: Jiri Olsa @ 2020-06-16 10:05 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann
  Cc: netdev, bpf, Song Liu, Yonghong Song, Martin KaFai Lau,
	David Miller, John Fastabend, Wenbo Zhang, KP Singh,
	Andrii Nakryiko, Brendan Gregg, Florent Revest, Al Viro
In-Reply-To: <20200616100512.2168860-1-jolsa@kernel.org>

Adding the btf_struct_address function, which takes 2 BTF objects
and an offset as arguments and checks whether object A is nested
in object B at the given offset.

This function is used when checking the helper function
PTR_TO_BTF_ID arguments. If the argument has an offset value,
btf_struct_address checks whether the object at the final
address has the expected BTF ID.

This way we can access nested BTF objects under PTR_TO_BTF_ID
pointer type and pass them to helpers, while they still point
to valid kernel BTF objects.
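
The nesting check can be pictured with a toy type table. The sketch
below is a simplified userspace model, not the code in this patch: the
toy_* names, the two example structs and their layout are all invented,
and arrays, unions and bitfields are ignored.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum { T_SCALAR = 0, T_INNER = 1, T_OUTER = 2, T_NR = 3 };

/* One struct member: byte offset, byte size and the ID of its type. */
struct toy_member {
	unsigned int off;
	unsigned int size;
	int type_id;
};

struct toy_type {
	const struct toy_member *members;
	size_t nr;
};

/* struct inner { u32 a; u32 b; }; */
static const struct toy_member inner_members[] = {
	{ 0, 4, T_SCALAR },
	{ 4, 4, T_SCALAR },
};

/* struct outer { u64 x; struct inner in; }; */
static const struct toy_member outer_members[] = {
	{ 0, 8, T_SCALAR },
	{ 8, 8, T_INNER },
};

static const struct toy_type toy_types[T_NR] = {
	[T_SCALAR] = { NULL, 0 },
	[T_INNER]  = { inner_members, 2 },
	[T_OUTER]  = { outer_members, 2 },
};

/* Return 0 if an object of type expected_id starts exactly at byte
 * offset off inside an object of type container_id, else -EINVAL. */
static int toy_struct_address(int container_id, unsigned int off,
			      int expected_id)
{
	const struct toy_type *t;
	size_t i;

	if (container_id < 0 || container_id >= T_NR)
		return -EINVAL;
	if (off == 0 && container_id == expected_id)
		return 0;

	t = &toy_types[container_id];
	for (i = 0; i < t->nr; i++) {
		const struct toy_member *m = &t->members[i];

		if (off < m->off || off >= m->off + m->size)
			continue;
		/* only nested structs can match an expected ID */
		if (m->type_id == T_SCALAR)
			return -EINVAL;
		/* descend, with the offset made relative to the member */
		return toy_struct_address(m->type_id, off - m->off,
					  expected_id);
	}
	return -EINVAL;
}
```

In this model a pointer to struct outer plus 8 is accepted where
struct inner is expected, while outer plus 12 is rejected because it
lands in the middle of the nested object.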

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 include/linux/bpf.h   |  3 +++
 kernel/bpf/btf.c      | 63 ++++++++++++++++++++++++++++++++++++++-----
 kernel/bpf/verifier.c | 32 ++++++++++++++--------
 3 files changed, 81 insertions(+), 17 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index b7d3b5f3dc09..e98c113a5d27 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1283,6 +1283,9 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
 int btf_struct_access(struct bpf_verifier_log *log,
 		      const struct btf_type *t, int off, int size,
 		      u32 *next_btf_id);
+int btf_struct_address(struct bpf_verifier_log *log,
+		     const struct btf_type *t,
+		     u32 off, u32 id);
 int btf_resolve_helper_id(struct bpf_verifier_log *log,
 			  const struct bpf_func_proto *fn, int);
 
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 304369a4c2e2..6924180a19c4 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -3829,9 +3829,22 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
 	return true;
 }
 
-int btf_struct_access(struct bpf_verifier_log *log,
-		      const struct btf_type *t, int off, int size,
-		      u32 *next_btf_id)
+enum access_op {
+	ACCESS_NEXT,
+	ACCESS_EXPECT,
+};
+
+struct access_data {
+	enum access_op op;
+	union {
+		u32 *next_btf_id;
+		const struct btf_type *exp_type;
+	};
+};
+
+static int struct_access(struct bpf_verifier_log *log,
+			 const struct btf_type *t, int off, int size,
+			 struct access_data *data)
 {
 	u32 i, moff, mtrue_end, msize = 0, total_nelems = 0;
 	const struct btf_type *mtype, *elem_type = NULL;
@@ -3879,8 +3892,7 @@ int btf_struct_access(struct bpf_verifier_log *log,
 			goto error;
 
 		off = (off - moff) % elem_type->size;
-		return btf_struct_access(log, elem_type, off, size,
-					 next_btf_id);
+		return struct_access(log, elem_type, off, size, data);
 
 error:
 		bpf_log(log, "access beyond struct %s at off %u size %u\n",
@@ -4008,9 +4020,21 @@ int btf_struct_access(struct bpf_verifier_log *log,
 
 			/* adjust offset we're looking for */
 			off -= moff;
+
+			/* We are nesting into another struct,
+			 * check if we are crossing expected ID.
+			 */
+			if (data->op == ACCESS_EXPECT && !off && t == data->exp_type)
+				return 0;
 			goto again;
 		}
 
+		/* We are interested only in structs for expected ID,
+		 * bail out.
+		 */
+		if (data->op == ACCESS_EXPECT)
+			return -EINVAL;
+
 		if (btf_type_is_ptr(mtype)) {
 			const struct btf_type *stype;
 			u32 id;
@@ -4024,7 +4048,7 @@ int btf_struct_access(struct bpf_verifier_log *log,
 
 			stype = btf_type_skip_modifiers(btf_vmlinux, mtype->type, &id);
 			if (btf_type_is_struct(stype)) {
-				*next_btf_id = id;
+				*data->next_btf_id = id;
 				return PTR_TO_BTF_ID;
 			}
 		}
@@ -4048,6 +4072,33 @@ int btf_struct_access(struct bpf_verifier_log *log,
 	return -EINVAL;
 }
 
+int btf_struct_access(struct bpf_verifier_log *log,
+		      const struct btf_type *t, int off, int size,
+		      u32 *next_btf_id)
+{
+	struct access_data data = {
+		.op = ACCESS_NEXT,
+		.next_btf_id = next_btf_id,
+	};
+
+	return struct_access(log, t, off, size, &data);
+}
+
+int btf_struct_address(struct bpf_verifier_log *log,
+		       const struct btf_type *t,
+		       u32 off, u32 id)
+{
+	struct access_data data = { .op = ACCESS_EXPECT };
+	const struct btf_type *type;
+
+	type = btf_type_by_id(btf_vmlinux, id);
+	if (!type)
+		return -EINVAL;
+
+	data.exp_type = type;
+	return struct_access(log, t, off, 1, &data);
+}
+
 int btf_resolve_helper_id(struct bpf_verifier_log *log,
 			  const struct bpf_func_proto *fn, int arg)
 {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index b553e4523bd3..bee3da2cd945 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3741,6 +3741,7 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno,
 {
 	struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
 	enum bpf_reg_type expected_type, type = reg->type;
+	const struct btf_type *btf_type;
 	int err = 0;
 
 	if (arg_type == ARG_DONTCARE)
@@ -3820,17 +3821,26 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno,
 		expected_type = PTR_TO_BTF_ID;
 		if (type != expected_type)
 			goto err_type;
-		if (reg->btf_id != meta->btf_id) {
-			verbose(env, "Helper has type %s got %s in R%d\n",
-				kernel_type_name(meta->btf_id),
-				kernel_type_name(reg->btf_id), regno);
-
-			return -EACCES;
-		}
-		if (!tnum_is_const(reg->var_off) || reg->var_off.value || reg->off) {
-			verbose(env, "R%d is a pointer to in-kernel struct with non-zero offset\n",
-				regno);
-			return -EACCES;
+		if (reg->off) {
+			btf_type = btf_type_by_id(btf_vmlinux, reg->btf_id);
+			if (btf_struct_address(&env->log, btf_type, reg->off, meta->btf_id)) {
+				verbose(env, "Helper has type %s got %s in R%d, off %d\n",
+					kernel_type_name(meta->btf_id),
+					kernel_type_name(reg->btf_id), regno, reg->off);
+				return -EACCES;
+			}
+		} else {
+			if (reg->btf_id != meta->btf_id) {
+				verbose(env, "Helper has type %s got %s in R%d\n",
+					kernel_type_name(meta->btf_id),
+					kernel_type_name(reg->btf_id), regno);
+				return -EACCES;
+			}
+			if (!tnum_is_const(reg->var_off) || reg->var_off.value) {
+				verbose(env, "R%d is a pointer to in-kernel struct with non-zero offset\n",
+					regno);
+				return -EACCES;
+			}
 		}
 	} else if (arg_type == ARG_PTR_TO_SPIN_LOCK) {
 		if (meta->func_id == BPF_FUNC_spin_lock) {
-- 
2.25.4



* [PATCH 06/11] bpf: Do not pass enum bpf_access_type to btf_struct_access
From: Jiri Olsa @ 2020-06-16 10:05 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann
  Cc: netdev, bpf, Song Liu, Yonghong Song, Martin KaFai Lau,
	David Miller, John Fastabend, Wenbo Zhang, KP Singh,
	Andrii Nakryiko, Brendan Gregg, Florent Revest, Al Viro
In-Reply-To: <20200616100512.2168860-1-jolsa@kernel.org>

There's no need for it.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 include/linux/bpf.h   | 1 -
 kernel/bpf/btf.c      | 3 +--
 kernel/bpf/verifier.c | 2 +-
 net/ipv4/bpf_tcp_ca.c | 2 +-
 4 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index f18c23dcc858..b7d3b5f3dc09 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1282,7 +1282,6 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
 		    struct bpf_insn_access_aux *info);
 int btf_struct_access(struct bpf_verifier_log *log,
 		      const struct btf_type *t, int off, int size,
-		      enum bpf_access_type atype,
 		      u32 *next_btf_id);
 int btf_resolve_helper_id(struct bpf_verifier_log *log,
 			  const struct bpf_func_proto *fn, int);
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index aea7b2cc8d26..304369a4c2e2 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -3831,7 +3831,6 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
 
 int btf_struct_access(struct bpf_verifier_log *log,
 		      const struct btf_type *t, int off, int size,
-		      enum bpf_access_type atype,
 		      u32 *next_btf_id)
 {
 	u32 i, moff, mtrue_end, msize = 0, total_nelems = 0;
@@ -3880,7 +3879,7 @@ int btf_struct_access(struct bpf_verifier_log *log,
 			goto error;
 
 		off = (off - moff) % elem_type->size;
-		return btf_struct_access(log, elem_type, off, size, atype,
+		return btf_struct_access(log, elem_type, off, size,
 					 next_btf_id);
 
 error:
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 5c7bbaac81ef..b553e4523bd3 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3175,7 +3175,7 @@ static int check_ptr_to_btf_access(struct bpf_verifier_env *env,
 			return -EACCES;
 		}
 
-		ret = btf_struct_access(&env->log, t, off, size, atype,
+		ret = btf_struct_access(&env->log, t, off, size,
 					&btf_id);
 	}
 
diff --git a/net/ipv4/bpf_tcp_ca.c b/net/ipv4/bpf_tcp_ca.c
index e3939f76b024..c6aab9389ac4 100644
--- a/net/ipv4/bpf_tcp_ca.c
+++ b/net/ipv4/bpf_tcp_ca.c
@@ -130,7 +130,7 @@ static int bpf_tcp_ca_btf_struct_access(struct bpf_verifier_log *log,
 	size_t end;
 
 	if (atype == BPF_READ)
-		return btf_struct_access(log, t, off, size, atype, next_btf_id);
+		return btf_struct_access(log, t, off, size, next_btf_id);
 
 	if (t != tcp_sock_type) {
 		bpf_log(log, "only read is supported\n");
-- 
2.25.4



* [PATCH 08/11] bpf: Add BTF whitelist support
From: Jiri Olsa @ 2020-06-16 10:05 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann
  Cc: netdev, bpf, Song Liu, Yonghong Song, Martin KaFai Lau,
	David Miller, John Fastabend, Wenbo Zhang, KP Singh,
	Andrii Nakryiko, Brendan Gregg, Florent Revest, Al Viro
In-Reply-To: <20200616100512.2168860-1-jolsa@kernel.org>

Adding support for defining a 'whitelist' of BTF IDs, which is
also sorted.

The following defines a sorted list of BTF IDs that is accessible
within kernel code as btf_whitelist_d_path, with its count in
the btf_whitelist_d_path_cnt variable.

  extern int btf_whitelist_d_path[];
  extern int btf_whitelist_d_path_cnt;

  BTF_WHITELIST_ENTRY(btf_whitelist_d_path)
  BTF_ID(func, vfs_truncate)
  BTF_ID(func, vfs_fallocate)
  BTF_ID(func, dentry_open)
  BTF_ID(func, vfs_getattr)
  BTF_ID(func, filp_close)
  BTF_WHITELIST_END(btf_whitelist_d_path)
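
Keeping the list sorted is what makes a binary search possible. A
minimal userspace sketch using the C library bsearch(); the function
names and the ID values in the usage are made up, and the comparator
here compares instead of subtracting, so it cannot overflow for
arbitrary int values:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Three-way comparison of two ints without subtraction. */
static int toy_id_cmp(const void *a, const void *b)
{
	int x = *(const int *)a;
	int y = *(const int *)b;

	return (x > y) - (x < y);
}

/* Return true if id is present in the ascending-sorted list. */
static bool toy_id_list_contains(int id, const int *list, size_t cnt)
{
	return bsearch(&id, list, cnt, sizeof(*list), toy_id_cmp) != NULL;
}
```

The lookup costs O(log n) comparisons, but it silently gives wrong
answers on an unsorted list, so the sorting step is not optional.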

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 include/linux/bpf.h   |  3 +++
 kernel/bpf/btf.c      | 13 +++++++++++++
 kernel/bpf/btf_ids.h  | 38 ++++++++++++++++++++++++++++++++++++++
 kernel/bpf/verifier.c |  5 +++++
 4 files changed, 59 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index e98c113a5d27..a94e85c2ec50 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -283,6 +283,7 @@ struct bpf_func_proto {
 		enum bpf_arg_type arg_type[5];
 	};
 	int *btf_id; /* BTF ids of arguments */
+	bool (*allowed)(const struct bpf_prog *prog);
 };
 
 /* bpf_context is intentionally undefined structure. Pointer to bpf_context is
@@ -1745,6 +1746,8 @@ enum bpf_text_poke_type {
 int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
 		       void *addr1, void *addr2);
 
+bool btf_whitelist_search(int id, int list[], int cnt);
+
 extern int bpf_skb_output_btf_ids[];
 extern int bpf_seq_printf_btf_ids[];
 extern int bpf_seq_write_btf_ids[];
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 6924180a19c4..feda74d232c5 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -20,6 +20,7 @@
 #include <linux/btf.h>
 #include <linux/skmsg.h>
 #include <linux/perf_event.h>
+#include <linux/bsearch.h>
 #include <net/sock.h>
 
 /* BTF (BPF Type Format) is the meta data format which describes
@@ -4669,3 +4670,15 @@ u32 btf_id(const struct btf *btf)
 {
 	return btf->id;
 }
+
+static int btf_id_cmp_func(const void *a, const void *b)
+{
+	const int *pa = a, *pb = b;
+
+	return *pa - *pb;
+}
+
+bool btf_whitelist_search(int id, int list[], int cnt)
+{
+	return bsearch(&id, list, cnt, sizeof(int), btf_id_cmp_func) != NULL;
+}
diff --git a/kernel/bpf/btf_ids.h b/kernel/bpf/btf_ids.h
index 68aa5c38a37f..a90c09faa515 100644
--- a/kernel/bpf/btf_ids.h
+++ b/kernel/bpf/btf_ids.h
@@ -67,4 +67,42 @@ asm(							\
 #name ":;                                      \n"	\
 ".popsection;                                  \n");
 
+
+/*
+ * The BTF_WHITELIST_ENTRY/END macros pair defines sorted
+ * list of BTF IDs plus its members count, with following
+ * layout:
+ *
+ * BTF_WHITELIST_ENTRY(list2)
+ * BTF_ID(type1, name1)
+ * BTF_ID(type2, name2)
+ * BTF_WHITELIST_END(list2)
+ *
+ * __BTF_ID__sort__list2:
+ * list2_cnt:
+ * .zero 4
+ * list2:
+ * __BTF_ID__type1__name1__3:
+ * .zero 4
+ * __BTF_ID__type2__name2__4:
+ * .zero 4
+ *
+ */
+#define BTF_WHITELIST_ENTRY(name)			\
+asm(							\
+".pushsection " SECTION ",\"a\";               \n"	\
+".global __BTF_ID__sort__" #name ";            \n"	\
+"__BTF_ID__sort__" #name ":;                   \n"	\
+".global " #name "_cnt;                        \n"	\
+#name "_cnt:;                                  \n"	\
+".zero 4                                       \n"	\
+".popsection;                                  \n");	\
+BTF_ID_LIST(name)
+
+#define BTF_WHITELIST_END(name)				\
+asm(							\
+".pushsection " SECTION ",\"a\";              \n"	\
+".size __BTF_ID__sort__" #name ", .-" #name " \n"	\
+".popsection;                                 \n");
+
 #endif
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index bee3da2cd945..5a9a6fd72907 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4633,6 +4633,11 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn
 		return -EINVAL;
 	}
 
+	if (fn->allowed && !fn->allowed(env->prog)) {
+		verbose(env, "helper call is not allowed in probe\n");
+		return -EINVAL;
+	}
+
 	/* With LD_ABS/IND some JITs save/restore skb from r1. */
 	changes_data = bpf_helper_changes_pkt_data(fn->func);
 	if (changes_data && fn->arg1_type != ARG_PTR_TO_CTX) {
-- 
2.25.4



* [PATCH 10/11] selftests/bpf: Add verifier test for d_path helper
From: Jiri Olsa @ 2020-06-16 10:05 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann
  Cc: netdev, bpf, Song Liu, Yonghong Song, Martin KaFai Lau,
	David Miller, John Fastabend, Wenbo Zhang, KP Singh,
	Andrii Nakryiko, Brendan Gregg, Florent Revest, Al Viro
In-Reply-To: <20200616100512.2168860-1-jolsa@kernel.org>

Adding a verifier test that attaches a tracing program, calls
the d_path helper from within it, and checks that the call is
allowed for the dentry_open function and denied for the 'd_path'
function with the appropriate error.
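
The behaviour under test can be modelled in a few lines of userspace
C. This is only a sketch: the toy_* names are invented, and attach
targets are compared by name here rather than by BTF ID.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a tracing program and its attach target. */
struct toy_prog {
	const char *attach_func;
};

/* Hypothetical model of a helper with an optional 'allowed' hook. */
struct toy_helper {
	const char *name;
	bool (*allowed)(const struct toy_prog *prog);
};

static const char *const toy_d_path_allowlist[] = {
	"vfs_truncate", "vfs_fallocate", "dentry_open",
	"vfs_getattr", "filp_close",
};

static bool toy_d_path_allowed(const struct toy_prog *prog)
{
	size_t i, nr = sizeof(toy_d_path_allowlist) /
		       sizeof(toy_d_path_allowlist[0]);

	for (i = 0; i < nr; i++)
		if (!strcmp(prog->attach_func, toy_d_path_allowlist[i]))
			return true;
	return false;
}

/* Reject the call only if the helper has a hook and the hook says no. */
static int toy_check_helper_call(const struct toy_helper *fn,
				 const struct toy_prog *prog)
{
	if (fn->allowed && !fn->allowed(prog))
		return -EINVAL;
	return 0;
}
```

A helper without a hook stays callable from any program, so existing
helpers are unaffected by the new check.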

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/testing/selftests/bpf/test_verifier.c   | 13 ++++++-
 tools/testing/selftests/bpf/verifier/d_path.c | 38 +++++++++++++++++++
 2 files changed, 50 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/verifier/d_path.c

diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index 78a6bae56ea6..3cce3dc766a2 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -114,6 +114,7 @@ struct bpf_test {
 		bpf_testdata_struct_t retvals[MAX_TEST_RUNS];
 	};
 	enum bpf_attach_type expected_attach_type;
+	const char *kfunc;
 };
 
 /* Note we want this to be 64 bit aligned so that the end of our array is
@@ -984,8 +985,18 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
 		attr.log_level = 4;
 	attr.prog_flags = pflags;
 
+	if (prog_type == BPF_PROG_TYPE_TRACING && test->kfunc) {
+		attr.attach_btf_id = libbpf_find_vmlinux_btf_id(test->kfunc,
+						attr.expected_attach_type);
+	}
+
 	fd_prog = bpf_load_program_xattr(&attr, bpf_vlog, sizeof(bpf_vlog));
-	if (fd_prog < 0 && !bpf_probe_prog_type(prog_type, 0)) {
+
+	/* BPF_PROG_TYPE_TRACING requires more setup and
+	 * bpf_probe_prog_type won't give correct answer
+	 */
+	if (fd_prog < 0 && (prog_type != BPF_PROG_TYPE_TRACING) &&
+	    !bpf_probe_prog_type(prog_type, 0)) {
 		printf("SKIP (unsupported program type %d)\n", prog_type);
 		skips++;
 		goto close_fds;
diff --git a/tools/testing/selftests/bpf/verifier/d_path.c b/tools/testing/selftests/bpf/verifier/d_path.c
new file mode 100644
index 000000000000..e08181abc056
--- /dev/null
+++ b/tools/testing/selftests/bpf/verifier/d_path.c
@@ -0,0 +1,38 @@
+{
+	"d_path accept",
+	.insns = {
+	BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_1, 0),
+	BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+	BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+	BPF_MOV64_IMM(BPF_REG_6, 0),
+	BPF_STX_MEM(BPF_DW, BPF_REG_2, BPF_REG_6, 0),
+	BPF_LD_IMM64(BPF_REG_3, 8),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_d_path),
+	BPF_MOV64_IMM(BPF_REG_0, 0),
+	BPF_EXIT_INSN(),
+	},
+	.errstr = "R0 max value is outside of the array range",
+	.result = ACCEPT,
+	.prog_type = BPF_PROG_TYPE_TRACING,
+	.expected_attach_type = BPF_TRACE_FENTRY,
+	.kfunc = "dentry_open",
+},
+{
+	"d_path reject",
+	.insns = {
+	BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_1, 0),
+	BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+	BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+	BPF_MOV64_IMM(BPF_REG_6, 0),
+	BPF_STX_MEM(BPF_DW, BPF_REG_2, BPF_REG_6, 0),
+	BPF_LD_IMM64(BPF_REG_3, 8),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_d_path),
+	BPF_MOV64_IMM(BPF_REG_0, 0),
+	BPF_EXIT_INSN(),
+	},
+	.errstr = "helper call is not allowed in probe",
+	.result = REJECT,
+	.prog_type = BPF_PROG_TYPE_TRACING,
+	.expected_attach_type = BPF_TRACE_FENTRY,
+	.kfunc = "d_path",
+},
-- 
2.25.4



* [PATCH 09/11] bpf: Add d_path helper
From: Jiri Olsa @ 2020-06-16 10:05 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann
  Cc: netdev, bpf, Song Liu, Yonghong Song, Martin KaFai Lau,
	David Miller, John Fastabend, Wenbo Zhang, KP Singh,
	Andrii Nakryiko, Brendan Gregg, Florent Revest, Al Viro
In-Reply-To: <20200616100512.2168860-1-jolsa@kernel.org>

Adding the d_path helper function, which returns the full path
for a given 'struct path' object, which needs to be the
kernel BTF 'path' object.

The helper calls the d_path function directly.

Also updating the bpf.h tools uapi header and adding
'path' to the bpf_helpers_doc.py script.
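
One detail worth spelling out: d_path() builds the string at the end
of the buffer and returns a pointer into it, so the helper has to move
the result to the front. A simplified userspace sketch of that pattern,
where toy_d_path() is a made-up stand-in that copies a ready-made
string and the error handling is reduced to two cases:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for d_path(): place the NUL-terminated path at the END of
 * buf and return a pointer to its start, or NULL if it does not fit. */
static char *toy_d_path(const char *path, char *buf, size_t buflen)
{
	size_t need = strlen(path) + 1;
	char *p;

	if (need > buflen)
		return NULL;
	p = buf + buflen - need;
	memcpy(p, path, need);
	return p;
}

/* Return the string length on success with the path moved to the
 * start of buf, or a negative errno value on failure. */
static long toy_bpf_d_path(const char *path, char *buf, size_t sz)
{
	size_t len;
	char *p;

	if (sz == 0)
		return -EINVAL;
	p = toy_d_path(path, buf, sz);
	if (!p)
		return -ENAMETOOLONG;

	len = strlen(p);
	if (p != buf)
		/* regions may overlap, so memmove rather than memcpy */
		memmove(buf, p, len + 1);
	return (long)len;
}
```

Copying len + 1 bytes moves the terminating NUL together with the
string, so no separate write of the terminator is needed.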

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 include/linux/bpf.h            |  4 ++++
 include/uapi/linux/bpf.h       | 14 ++++++++++++-
 kernel/bpf/btf_ids.c           | 11 ++++++++++
 kernel/trace/bpf_trace.c       | 38 ++++++++++++++++++++++++++++++++++
 scripts/bpf_helpers_doc.py     |  2 ++
 tools/include/uapi/linux/bpf.h | 14 ++++++++++++-
 6 files changed, 81 insertions(+), 2 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index a94e85c2ec50..d35265b6c574 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1752,5 +1752,9 @@ extern int bpf_skb_output_btf_ids[];
 extern int bpf_seq_printf_btf_ids[];
 extern int bpf_seq_write_btf_ids[];
 extern int bpf_xdp_output_btf_ids[];
+extern int bpf_d_path_btf_ids[];
+
+extern int btf_whitelist_d_path[];
+extern int btf_whitelist_d_path_cnt;
 
 #endif /* _LINUX_BPF_H */
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c65b374a5090..e308746b9344 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -3252,6 +3252,17 @@ union bpf_attr {
  * 		case of **BPF_CSUM_LEVEL_QUERY**, the current skb->csum_level
  * 		is returned or the error code -EACCES in case the skb is not
  * 		subject to CHECKSUM_UNNECESSARY.
+ *
+ * int bpf_d_path(struct path *path, char *buf, u32 sz)
+ *	Description
+ *		Return full path for given 'struct path' object, which
+ *		needs to be the kernel BTF 'path' object. The path is
+ *		returned in buffer provided 'buf' of size 'sz'.
+ *
+ *	Return
+ *		length of returned string on success, or a negative
+ *		error in case of failure
+ *
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -3389,7 +3400,8 @@ union bpf_attr {
 	FN(ringbuf_submit),		\
 	FN(ringbuf_discard),		\
 	FN(ringbuf_query),		\
-	FN(csum_level),
+	FN(csum_level),			\
+	FN(d_path),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
diff --git a/kernel/bpf/btf_ids.c b/kernel/bpf/btf_ids.c
index d8d0df162f04..853c8fd59b06 100644
--- a/kernel/bpf/btf_ids.c
+++ b/kernel/bpf/btf_ids.c
@@ -13,3 +13,14 @@ BTF_ID(struct, seq_file)
 
 BTF_ID_LIST(bpf_xdp_output_btf_ids)
 BTF_ID(struct, xdp_buff)
+
+BTF_ID_LIST(bpf_d_path_btf_ids)
+BTF_ID(struct, path)
+
+BTF_WHITELIST_ENTRY(btf_whitelist_d_path)
+BTF_ID(func, vfs_truncate)
+BTF_ID(func, vfs_fallocate)
+BTF_ID(func, dentry_open)
+BTF_ID(func, vfs_getattr)
+BTF_ID(func, filp_close)
+BTF_WHITELIST_END(btf_whitelist_d_path)
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index c1866d76041f..0ff5d8434d40 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1016,6 +1016,42 @@ static const struct bpf_func_proto bpf_send_signal_thread_proto = {
 	.arg1_type	= ARG_ANYTHING,
 };
 
+BPF_CALL_3(bpf_d_path, struct path *, path, char *, buf, u32, sz)
+{
+	char *p = d_path(path, buf, sz - 1);
+	int len;
+
+	if (IS_ERR(p)) {
+		len = PTR_ERR(p);
+	} else {
+		len = strlen(p);
+		if (len && p != buf) {
+			memmove(buf, p, len);
+			buf[len] = 0;
+		}
+	}
+
+	return len;
+}
+
+static bool bpf_d_path_allowed(const struct bpf_prog *prog)
+{
+	return btf_whitelist_search(prog->aux->attach_btf_id,
+				    btf_whitelist_d_path,
+				    btf_whitelist_d_path_cnt);
+}
+
+static const struct bpf_func_proto bpf_d_path_proto = {
+	.func		= bpf_d_path,
+	.gpl_only	= true,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_BTF_ID,
+	.arg2_type	= ARG_PTR_TO_MEM,
+	.arg3_type	= ARG_CONST_SIZE,
+	.btf_id		= bpf_d_path_btf_ids,
+	.allowed	= bpf_d_path_allowed,
+};
+
 const struct bpf_func_proto *
 bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
@@ -1483,6 +1519,8 @@ tracing_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return prog->expected_attach_type == BPF_TRACE_ITER ?
 		       &bpf_seq_write_proto :
 		       NULL;
+	case BPF_FUNC_d_path:
+		return &bpf_d_path_proto;
 	default:
 		return raw_tp_prog_func_proto(func_id, prog);
 	}
diff --git a/scripts/bpf_helpers_doc.py b/scripts/bpf_helpers_doc.py
index 91fa668fa860..3161bf4ccee4 100755
--- a/scripts/bpf_helpers_doc.py
+++ b/scripts/bpf_helpers_doc.py
@@ -425,6 +425,7 @@ class PrinterHelpers(Printer):
             'struct __sk_buff',
             'struct sk_msg_md',
             'struct xdp_md',
+            'struct path',
     ]
     known_types = {
             '...',
@@ -458,6 +459,7 @@ class PrinterHelpers(Printer):
             'struct sockaddr',
             'struct tcphdr',
             'struct seq_file',
+            'struct path',
     }
     mapped_types = {
             'u8': '__u8',
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index c65b374a5090..e308746b9344 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -3252,6 +3252,17 @@ union bpf_attr {
  * 		case of **BPF_CSUM_LEVEL_QUERY**, the current skb->csum_level
  * 		is returned or the error code -EACCES in case the skb is not
  * 		subject to CHECKSUM_UNNECESSARY.
+ *
+ * int bpf_d_path(struct path *path, char *buf, u32 sz)
+ *	Description
+ *		Return full path for given 'struct path' object, which
+ *		needs to be the kernel BTF 'path' object. The path is
+ *		returned in buffer provided 'buf' of size 'sz'.
+ *
+ *	Return
+ *		length of returned string on success, or a negative
+ *		error in case of failure
+ *
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -3389,7 +3400,8 @@ union bpf_attr {
 	FN(ringbuf_submit),		\
 	FN(ringbuf_discard),		\
 	FN(ringbuf_query),		\
-	FN(csum_level),
+	FN(csum_level),			\
+	FN(d_path),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
-- 
2.25.4


^ permalink raw reply related

* [PATCH 11/11] selftests/bpf: Add test for d_path helper
From: Jiri Olsa @ 2020-06-16 10:05 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann
  Cc: Wenbo Zhang, netdev, bpf, Song Liu, Yonghong Song,
	Martin KaFai Lau, David Miller, John Fastabend, KP Singh,
	Andrii Nakryiko, Brendan Gregg, Florent Revest, Al Viro
In-Reply-To: <20200616100512.2168860-1-jolsa@kernel.org>

Add a test for the d_path helper. It is largely copied from
Wenbo Zhang's test for bpf_get_fd_path, which was never merged.

I have so far failed to compile the test with the <linux/fs.h>
kernel header, so for now the test adds a 'struct file' with an f_path
member at the same offset as in the kernel's file object.

Original-patch-by: Wenbo Zhang <ethercflow@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 .../testing/selftests/bpf/prog_tests/d_path.c | 153 ++++++++++++++++++
 .../testing/selftests/bpf/progs/test_d_path.c |  55 +++++++
 2 files changed, 208 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/d_path.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_d_path.c

diff --git a/tools/testing/selftests/bpf/prog_tests/d_path.c b/tools/testing/selftests/bpf/prog_tests/d_path.c
new file mode 100644
index 000000000000..e2b7dfeb506f
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/d_path.c
@@ -0,0 +1,153 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+#include <test_progs.h>
+#include <sys/stat.h>
+#include <linux/sched.h>
+#include <sys/syscall.h>
+
+#define MAX_PATH_LEN		128
+#define MAX_FILES		7
+#define MAX_EVENT_NUM		16
+
+struct d_path_test_data {
+	pid_t pid;
+	__u32 cnt_stat;
+	__u32 cnt_close;
+	char paths_stat[MAX_EVENT_NUM][MAX_PATH_LEN];
+	char paths_close[MAX_EVENT_NUM][MAX_PATH_LEN];
+};
+
+#include "test_d_path.skel.h"
+
+static struct {
+	__u32 cnt;
+	char paths[MAX_EVENT_NUM][MAX_PATH_LEN];
+} src;
+
+static int set_pathname(int fd, pid_t pid)
+{
+	char buf[MAX_PATH_LEN];
+
+	snprintf(buf, MAX_PATH_LEN, "/proc/%d/fd/%d", pid, fd);
+	return readlink(buf, src.paths[src.cnt++], MAX_PATH_LEN);
+}
+
+static int trigger_fstat_events(pid_t pid)
+{
+	int sockfd = -1, procfd = -1, devfd = -1;
+	int localfd = -1, indicatorfd = -1;
+	int pipefd[2] = { -1, -1 };
+	struct stat fileStat;
+	int ret = -1;
+
+	/* unmountable pseudo-filesystems */
+	if (CHECK_FAIL(pipe(pipefd) < 0))
+		return ret;
+	/* unmountable pseudo-filesystems */
+	sockfd = socket(AF_INET, SOCK_STREAM, 0);
+	if (CHECK_FAIL(sockfd < 0))
+		goto out_close;
+	/* mountable pseudo-filesystems */
+	procfd = open("/proc/self/comm", O_RDONLY);
+	if (CHECK_FAIL(procfd < 0))
+		goto out_close;
+	devfd = open("/dev/urandom", O_RDONLY);
+	if (CHECK_FAIL(devfd < 0))
+		goto out_close;
+	localfd = open("/tmp/d_path_loadgen.txt", O_CREAT | O_RDONLY, 0644);
+	if (CHECK_FAIL(localfd < 0))
+		goto out_close;
+	/* bpf_d_path will return path with (deleted) */
+	remove("/tmp/d_path_loadgen.txt");
+	indicatorfd = open("/tmp/", O_PATH);
+	if (CHECK_FAIL(indicatorfd < 0))
+		goto out_close;
+
+	ret = set_pathname(pipefd[0], pid);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(pipefd[1], pid);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(sockfd, pid);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(procfd, pid);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(devfd, pid);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(localfd, pid);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(indicatorfd, pid);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+
+	/* triggers vfs_getattr */
+	fstat(pipefd[0], &fileStat);
+	fstat(pipefd[1], &fileStat);
+	fstat(sockfd, &fileStat);
+	fstat(procfd, &fileStat);
+	fstat(devfd, &fileStat);
+	fstat(localfd, &fileStat);
+	fstat(indicatorfd, &fileStat);
+
+out_close:
+	/* triggers filp_close */
+	close(pipefd[0]);
+	close(pipefd[1]);
+	close(sockfd);
+	close(procfd);
+	close(devfd);
+	close(localfd);
+	close(indicatorfd);
+	return ret;
+}
+
+void test_d_path(void)
+{
+	struct test_d_path *skel;
+	struct d_path_test_data *dst;
+	__u32 duration = 0;
+	int err;
+
+	skel = test_d_path__open_and_load();
+	if (CHECK(!skel, "test_d_path_load", "d_path skeleton failed\n"))
+		goto cleanup;
+
+	err = test_d_path__attach(skel);
+	if (CHECK(err, "modify_return", "attach failed: %d\n", err))
+		goto cleanup;
+
+	dst = &skel->bss->data;
+	dst->pid = getpid();
+
+	err = trigger_fstat_events(skel->bss->data.pid);
+	if (CHECK_FAIL(err < 0))
+		goto cleanup;
+
+	for (int i = 0; i < MAX_FILES; i++) {
+		if (i < 3) {
+			CHECK((dst->paths_stat[i][0] == 0), "d_path",
+			      "failed to filter fs [%d]: %s vs %s\n",
+			      i, src.paths[i], dst->paths_stat[i]);
+			CHECK((dst->paths_close[i][0] == 0), "d_path",
+			      "failed to filter fs [%d]: %s vs %s\n",
+			      i, src.paths[i], dst->paths_close[i]);
+		} else {
+			CHECK(strncmp(src.paths[i], dst->paths_stat[i], MAX_PATH_LEN),
+			      "d_path",
+			      "failed to get stat path[%d]: %s vs %s\n",
+			      i, src.paths[i], dst->paths_stat[i]);
+			CHECK(strncmp(src.paths[i], dst->paths_close[i], MAX_PATH_LEN),
+			      "d_path",
+			      "failed to get close path[%d]: %s vs %s\n",
+			      i, src.paths[i], dst->paths_close[i]);
+		}
+	}
+
+cleanup:
+	test_d_path__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_d_path.c b/tools/testing/selftests/bpf/progs/test_d_path.c
new file mode 100644
index 000000000000..1b478c00ee7a
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_d_path.c
@@ -0,0 +1,55 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+#define MAX_PATH_LEN		128
+#define MAX_EVENT_NUM		16
+
+static struct d_path_test_data {
+	pid_t pid;
+	__u32 cnt_stat;
+	__u32 cnt_close;
+	char paths_stat[MAX_EVENT_NUM][MAX_PATH_LEN];
+	char paths_close[MAX_EVENT_NUM][MAX_PATH_LEN];
+} data;
+
+struct path;
+struct kstat;
+
+SEC("fentry/vfs_getattr")
+int BPF_PROG(prog_stat, struct path *path, struct kstat *stat,
+	     __u32 request_mask, unsigned int query_flags)
+{
+	pid_t pid = bpf_get_current_pid_tgid() >> 32;
+
+	if (pid != data.pid)
+		return 0;
+
+	if (data.cnt_stat >= MAX_EVENT_NUM)
+		return 0;
+
+	bpf_d_path(path, data.paths_stat[data.cnt_stat], MAX_PATH_LEN);
+	data.cnt_stat++;
+	return 0;
+}
+
+SEC("fentry/filp_close")
+int BPF_PROG(prog_close, struct file *file, void *id)
+{
+	pid_t pid = bpf_get_current_pid_tgid() >> 32;
+
+	if (pid != data.pid)
+		return 0;
+
+	if (data.cnt_close >= MAX_EVENT_NUM)
+		return 0;
+
+	bpf_d_path((struct path *) &file->f_path,
+		   data.paths_close[data.cnt_close], MAX_PATH_LEN);
+	data.cnt_close++;
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.25.4


^ permalink raw reply related

* Re: [PATCHv4 bpf-next 1/2] xdp: add a new helper for dev map multicast support
From: Hangbin Liu @ 2020-06-16 10:11 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: bpf, netdev, Toke Høiland-Jørgensen, Jiri Benc,
	Eelco Chaudron, ast, Daniel Borkmann, Lorenzo Bianconi
In-Reply-To: <20200616105506.163ea5a3@carbon>

Hi Jesper,

On Tue, Jun 16, 2020 at 10:55:06AM +0200, Jesper Dangaard Brouer wrote:
> > Is there anything else I should do except add the following line?
> > 	nxdpf->mem.type = MEM_TYPE_PAGE_ORDER0;
> 
> You do realize that you also have copied over the mem.id, right?

Thanks for the reminder. To confirm, setting mem.id to 0 is enough, right?
> 
> And as I wrote below you also need to update frame_sz.
> 
> > > 
> > > You also need to update xdpf->frame_sz, as you also cannot assume it is
> > > the same.  
> > 
> > Won't the memcpy() copy xdpf->frame_sz to nxdpf? 
> 
> You obviously cannot use the frame_sz from the existing frame, as you
> just allocated a new page for the new xdp_frame, that have another size
> (here PAGE_SIZE).

Thanks, I didn't understand the frame_sz correctly before.
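
A minimal user-space sketch of the point above, not the kernel code: when
a frame is copied into a newly allocated page, the fields that describe the
backing memory (frame_sz, mem.type, mem.id) have to be rewritten, because a
plain copy carries over the values of the old buffer. All sketch_* names and
the enum values are hypothetical stand-ins, and using 0 for mem.id is only
the assumption raised in this thread, not a confirmed answer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* stand-in for the kernel's PAGE_SIZE */

/* Hypothetical values; the kernel defines its own MEM_TYPE_* constants. */
enum { SKETCH_MEM_PAGE_ORDER0 = 1, SKETCH_MEM_PAGE_POOL = 2 };

struct sketch_mem_info {
	uint32_t type;
	uint32_t id;
};

/* Reduced mock of struct xdp_frame, only the fields discussed here. */
struct sketch_frame {
	void *data;
	uint16_t len;
	uint16_t headroom;
	uint32_t metasize:8;
	uint32_t frame_sz:24;
	struct sketch_mem_info mem;
};

/* Clone src into a fresh "page". Returns NULL if it does not fit. */
static struct sketch_frame *sketch_frame_clone(const struct sketch_frame *src)
{
	size_t offset = sizeof(*src) + src->headroom;
	struct sketch_frame *dst;
	void *page;

	if (offset + src->len > SKETCH_PAGE_SIZE)
		return NULL;

	page = calloc(1, SKETCH_PAGE_SIZE);
	if (!page)
		return NULL;

	dst = page;
	*dst = *src;	/* copies the stale frame_sz and mem info as well */
	memcpy((unsigned char *)page + offset, src->data, src->len);

	/* Fix up everything that described the old buffer. */
	dst->data = (unsigned char *)page + offset;
	dst->frame_sz = SKETCH_PAGE_SIZE;
	dst->mem.type = SKETCH_MEM_PAGE_ORDER0;
	dst->mem.id = 0;	/* assumption, see the question above */

	return dst;
}
```

The caller owns the returned buffer and releases it with free().
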
> 
> 
> > And I didn't see xdpf->frame_sz is set in xdp_convert_zc_to_xdp_frame(),
> > do we need a fix?
> 
> Good catch, that sounds like a bug, that should be fixed.
> Will you send a fix?

OK, I will.

> 
> 
> > > > +
> > > > +	nxdpf = addr;
> > > > +	nxdpf->data = addr + headroom;
> > > > +
> > > > +	return nxdpf;
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(xdpf_clone);  
> > > 
> > > 
> > > struct xdp_frame {
> > > 	void *data;
> > > 	u16 len;
> > > 	u16 headroom;
> > > 	u32 metasize:8;
> > > 	u32 frame_sz:24;
> > > 	/* Lifetime of xdp_rxq_info is limited to NAPI/enqueue time,
> > > 	 * while mem info is valid on remote CPU.
> > > 	 */
> > > 	struct xdp_mem_info mem;
> > > 	struct net_device *dev_rx; /* used by cpumap */
> > > };
> > >   
> > 
> 
> struct xdp_mem_info {
> 	u32                        type;                 /*     0     4 */
> 	u32                        id;                   /*     4     4 */
> 
> 	/* size: 8, cachelines: 1, members: 2 */
> 	/* last cacheline: 8 bytes */
> };
> 

Is this just a struct reference, or do you want to remind me of something else?

Thanks
Hangbin

^ permalink raw reply

* Re: [PATCH net v2 2/2] net/sched: act_gate: fix configuration of the periodic timer
From: Davide Caratti @ 2020-06-16 10:12 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: Po Liu, Cong Wang, David S . Miller, netdev
In-Reply-To: <CA+h21ho1x1-N+HyFXcy+pqdWcQioFWgRs0C+1h+kn6w8zHVUwQ@mail.gmail.com>

hello Vladimir,

thanks a lot for reviewing this.

On Tue, 2020-06-16 at 00:55 +0300, Vladimir Oltean wrote:

[...]

> > diff --git a/net/sched/act_gate.c b/net/sched/act_gate.c
> > index 6775ccf355b0..3c529a4bcca5 100644
> > --- a/net/sched/act_gate.c
> > +++ b/net/sched/act_gate.c
> > @@ -272,6 +272,27 @@ static int parse_gate_list(struct nlattr *list_attr,
> >         return err;
> >  }
> > 
> > +static void gate_setup_timer(struct tcf_gate *gact, u64 basetime,
> > +                            enum tk_offsets tko, s32 clockid,
> > +                            bool do_init)
> > +{
> > +       if (!do_init) {
> > +               if (basetime == gact->param.tcfg_basetime &&
> > +                   tko == gact->tk_offset &&
> > +                   clockid == gact->param.tcfg_clockid)
> > +                       return;
> > +
> > +               spin_unlock_bh(&gact->tcf_lock);
> > +               hrtimer_cancel(&gact->hitimer);
> > +               spin_lock_bh(&gact->tcf_lock);
> 
> I think it's horrible to do this just to get out of atomic context.
> What if you split the "replace" functionality of gate_setup_timer into
> a separate gate_cancel_timer function, which you could call earlier
> (before taking the spin lock)? 

I think it would introduce the following 2 problems:

problem #1) a race condition, see below:

> That change would look like this:
> diff --git a/net/sched/act_gate.c b/net/sched/act_gate.c
> index 3c529a4bcca5..47c625a0e70c 100644
> --- a/net/sched/act_gate.c
> +++ b/net/sched/act_gate.c
> @@ -273,19 +273,8 @@ static int parse_gate_list(struct nlattr *list_attr,
>  }
> 
>  static void gate_setup_timer(struct tcf_gate *gact, u64 basetime,
> -                 enum tk_offsets tko, s32 clockid,
> -                 bool do_init)
> +                 enum tk_offsets tko, s32 clockid)
>  {
> -    if (!do_init) {
> -        if (basetime == gact->param.tcfg_basetime &&
> -            tko == gact->tk_offset &&
> -            clockid == gact->param.tcfg_clockid)
> -            return;
> -
> -        spin_unlock_bh(&gact->tcf_lock);
> -        hrtimer_cancel(&gact->hitimer);
> -        spin_lock_bh(&gact->tcf_lock);
> -    }
>      gact->param.tcfg_basetime = basetime;
>      gact->param.tcfg_clockid = clockid;
>      gact->tk_offset = tko;
> @@ -293,6 +282,17 @@ static void gate_setup_timer(struct tcf_gate
> *gact, u64 basetime,
>      gact->hitimer.function = gate_timer_func;
>  }
> 
> +static void gate_cancel_timer(struct tcf_gate *gact, u64 basetime,
> +                  enum tk_offsets tko, s32 clockid)
> +{
> +    if (basetime == gact->param.tcfg_basetime &&
> +        tko == gact->tk_offset &&
> +        clockid == gact->param.tcfg_clockid)
> +        return;
> +
> +    hrtimer_cancel(&gact->hitimer);
> +}
> +

the above function either cancels a timer, or does nothing: it depends on
the value of the triple {tcfg_basetime, tk_offset, tcfg_clockid}. If we run
this function without holding tcf_lock, nothing guarantees that
{tcfg_basetime, tk_offset, tcfg_clockid} is not being concurrently
rewritten by some other command like:

# tc action replace action gate <parameters> index <x>

>  static int tcf_gate_init(struct net *net, struct nlattr *nla,
>               struct nlattr *est, struct tc_action **a,
>               int ovr, int bind, bool rtnl_held,
> @@ -381,6 +381,8 @@ static int tcf_gate_init(struct net *net, struct
> nlattr *nla,
>      gact = to_gate(*a);
>      if (ret == ACT_P_CREATED)
>          INIT_LIST_HEAD(&gact->param.entries);
> +    else
> +        gate_cancel_timer(gact, basetime, tk_offset, clockid);
> 

IOW, the above line is racy unless we do spin_lock()/spin_unlock() around
the

if (<expression depending on gact-> members>)
	return; 

statement before hrtimer_cancel(), which does not seem much different
than what I did in gate_setup_timer().
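
A minimal user-space sketch of the pattern argued for here, not the driver
code: the comparison of the triple happens with the lock held, and the lock
is dropped only around the cancel. The lock is modelled as a plain flag with
assertions so the ordering is visible; all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct sketch_gate {
	bool locked;		/* stands in for tcf_lock */
	uint64_t basetime;
	int tk_offset;
	int32_t clockid;
	bool timer_running;	/* stands in for the hrtimer state */
};

static void sketch_lock(struct sketch_gate *g)
{
	assert(!g->locked);
	g->locked = true;
}

static void sketch_unlock(struct sketch_gate *g)
{
	assert(g->locked);
	g->locked = false;
}

/* Stand-in for hrtimer_cancel(): must not run with the lock held. */
static void sketch_timer_cancel(struct sketch_gate *g)
{
	assert(!g->locked);
	g->timer_running = false;
}

/*
 * Caller holds the lock. Returns true if the timer was cancelled and the
 * parameters were rewritten, false if nothing had to change.
 */
static bool sketch_setup_timer(struct sketch_gate *g, uint64_t basetime,
			       int tko, int32_t clockid)
{
	assert(g->locked);

	/* The triple is only read while the lock is held. */
	if (basetime == g->basetime && tko == g->tk_offset &&
	    clockid == g->clockid)
		return false;

	sketch_unlock(g);
	sketch_timer_cancel(g);
	sketch_lock(g);

	g->basetime = basetime;
	g->tk_offset = tko;
	g->clockid = clockid;
	return true;
}
```

The caller is expected to restart the timer afterwards when the function
returns true.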

[...]

> @@ -433,6 +448,11 @@ static int tcf_gate_init(struct net *net, struct nlattr *nla,
> >         if (goto_ch)
> >                 tcf_chain_put_by_act(goto_ch);
> >  release_idr:
> > +       /* action is not in: hitimer can be inited without taking tcf_lock */
> > +       if (ret == ACT_P_CREATED)
> > +               gate_setup_timer(gact, gact->param.tcfg_basetime,
> > +                                gact->tk_offset, gact->param.tcfg_clockid,
> > +                                true);

please note, here I felt the need to add a comment, because when ret ==
ACT_P_CREATED the action is not inserted in any list, so there is no
concurrent writer of gact-> members for that action.

> >         tcf_idr_release(*a, bind);
> >         return err;
> >  }

problem #2) a functional issue that originates in how 'cycle_time' and
'entries' are validated (*). See below:

On Tue, 2020-06-16 at 00:55 +0300, Vladimir Oltean wrote:

> static int tcf_gate_init(struct net *net, struct nlattr *nla,
>               struct nlattr *est, struct tc_action **a,
>               int ovr, int bind, bool rtnl_held,
> @@ -381,6 +381,8 @@ static int tcf_gate_init(struct net *net, struct nlattr *nla,
>      gact = to_gate(*a);
>      if (ret == ACT_P_CREATED)
>          INIT_LIST_HEAD(&gact->param.entries);
> +    else
> +        gate_cancel_timer(gact, basetime, tk_offset, clockid);

here you propose to cancel the timer, but a few lines later we have this:

385         err = tcf_action_check_ctrlact(parm->action, tp, &goto_ch, extack);
386         if (err < 0)
387                 goto release_idr;
388 

so, when users try the following commands:

# tc action add action gate <good parameters> index 2
# tc action replace action gate <other good parameters> goto chain 42 index 2

and chain 42 does not exist, the second command will fail. But the timer
is erroneously stopped, and never started again. So, the first rule is
correctly inserted, but it stops working after users try to replace it
with one that has an invalid control action.

Moving the call to gate_cancel_timer() after the validation of the control
action will not fix this problem, because 'cycle_time' and 'entries' are
validated together, and with the spinlock taken. Because of this, we need
to cancel that timer only when we know that we will not do
tcf_idr_release() and return some error to the user.
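
A minimal user-space sketch of problem #2, under simplified assumptions:
validation is collapsed into one hypothetical function, and the timer is a
boolean. It only contrasts the two orderings; all sketch_* names are
hypothetical and nothing here is the act_gate code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

struct sketch_action {
	uint64_t cycle_time;
	bool timer_running;
};

struct sketch_request {
	uint64_t cycle_time;
	bool ctrlact_valid;	/* false for e.g. "goto chain 42" without chain 42 */
};

static int sketch_validate(const struct sketch_request *req)
{
	if (!req->ctrlact_valid || req->cycle_time == 0)
		return -EINVAL;
	return 0;
}

/* Cancel first, validate later: a failed replace leaves the timer stopped. */
static int sketch_replace_cancel_first(struct sketch_action *act,
				       const struct sketch_request *req)
{
	int err;

	act->timer_running = false;
	err = sketch_validate(req);
	if (err < 0)
		return err;	/* old action stays installed, timer is gone */

	act->cycle_time = req->cycle_time;
	act->timer_running = true;
	return 0;
}

/* Validate first: the timer is touched only once no error can follow. */
static int sketch_replace_validate_first(struct sketch_action *act,
					 const struct sketch_request *req)
{
	int err = sketch_validate(req);

	if (err < 0)
		return err;	/* old action keeps running untouched */

	act->timer_running = false;
	act->cycle_time = req->cycle_time;
	act->timer_running = true;
	return 0;
}
```

With an invalid request, both variants return an error, but only the second
one leaves the previously installed action with its timer still running.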

please let me know if you think my doubts are not well-founded.

-- 
davide

(*) now that I see parse_gate_list() again, I noticed another potential
issue with replace (that I need to verify first): apparently the list is
not replaced, it's just "updated" with new entries appended at the end. I
will try to write a fix for that (separate from this series).



^ permalink raw reply

* Re: [PATCH v3 1/3] net: phy: mscc: move shared probe code into a helper
From: Russell King - ARM Linux admin @ 2020-06-16 10:13 UTC (permalink / raw)
  To: Heiko Stübner
  Cc: David Miller, kuba, robh+dt, andrew, f.fainelli, hkallweit1,
	netdev, devicetree, linux-kernel, christoph.muellner
In-Reply-To: <1656001.WqWBulSbu3@diego>

On Tue, Jun 16, 2020 at 11:10:27AM +0200, Heiko Stübner wrote:
> > 
> > You also need to provide a proper header posting when you repost this series
> > after fixing this bug.
> 
> not sure I understand what you mean with "header posting" here.

David is requesting that you send a "0/N" email summarising the purpose
of the patch series and any other relevant information to the series as
a whole.  The subsequent patches should be threaded to the 0/N email.

The 0/N email should also contain the overall diffstat for the series.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply

* Re: [Intel-wired-lan] [PATCH] e1000e: continue to init phy even when failed to disable ULP
From: Paul Menzel @ 2020-06-16 10:20 UTC (permalink / raw)
  To: Aaron Ma, jeffrey.t.kirsher, davem, kuba, intel-wired-lan, netdev,
	linux-kernel, vitaly.lifshits, kai.heng.feng, sasha.neftin
In-Reply-To: <20200616100512.22512-1-aaron.ma@canonical.com>

Dear Aaron,


Thank you for your patch.

(Rant: Some more fallout from the other patch, which nobody reverted.)

Am 16.06.20 um 12:05 schrieb Aaron Ma:
> After commit "e1000e: disable s0ix entry and exit flows for ME systems",
> some ThinkPads always failed to disable ulp by ME.

Please add the (short) commit hash from the master branch.

s/ulp/ULP/

Please list one ThinkPad as example.

> commit "e1000e: Warn if disabling ULP failed" break out of init phy:

1.  Please add the closing quote ".
2.  Please add the commit hash.

> error log:
> [   42.364753] e1000e 0000:00:1f.6 enp0s31f6: Failed to disable ULP
> [   42.524626] e1000e 0000:00:1f.6 enp0s31f6: PHY Wakeup cause - Unicast Packet
> [   42.822476] e1000e 0000:00:1f.6 enp0s31f6: Hardware Error
> 
> When disable s0ix, E1000_FWSM_ULP_CFG_DONE will never be 1.
> If continue to init phy like before, it can work as before.
> iperf test result good too.
> 
> Chnage e_warn to e_dbg, in case it confuses.

s/Chnage/Change/

Please leave the level warning, and improve the warning message instead, 
so a user knows what is going on.

Could you please add a `Fixes:` tag and the URL to the bug report?

> Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
> ---
>   drivers/net/ethernet/intel/e1000e/ich8lan.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.c b/drivers/net/ethernet/intel/e1000e/ich8lan.c
> index f999cca37a8a..63405819eb83 100644
> --- a/drivers/net/ethernet/intel/e1000e/ich8lan.c
> +++ b/drivers/net/ethernet/intel/e1000e/ich8lan.c
> @@ -302,8 +302,7 @@ static s32 e1000_init_phy_workarounds_pchlan(struct e1000_hw *hw)
>   	hw->dev_spec.ich8lan.ulp_state = e1000_ulp_state_unknown;
>   	ret_val = e1000_disable_ulp_lpt_lp(hw, true);
>   	if (ret_val) {
> -		e_warn("Failed to disable ULP\n");
> -		goto out;
> +		e_dbg("Failed to disable ULP\n");
>   	}
>   
>   	ret_val = hw->phy.ops.acquire(hw);
> 

Kind regards,

Paul

^ permalink raw reply

* Re: [Intel-wired-lan] [PATCH] e1000e: continue to init phy even when failed to disable ULP
From: Aaron Ma @ 2020-06-16 10:31 UTC (permalink / raw)
  To: Paul Menzel, jeffrey.t.kirsher, davem, kuba, intel-wired-lan,
	netdev, linux-kernel, vitaly.lifshits, kai.heng.feng,
	sasha.neftin
In-Reply-To: <74391e62-7226-b0f8-d129-768b88f13160@molgen.mpg.de>

On 6/16/20 6:20 PM, Paul Menzel wrote:
> Dear Aaron,
> 
> 
> Thank you for your patch.
> 
> (Rant: Some more fallout from the other patch, which nobody reverted.)
> 

Would you like a revert?

Thanks,
Aaron

> Am 16.06.20 um 12:05 schrieb Aaron Ma:
>> After commit "e1000e: disable s0ix entry and exit flows for ME systems",
>> some ThinkPads always failed to disable ulp by ME.
> 
> Please add the (short) commit hash from the master branch.
> 
> s/ulp/ULP/
> 
> Please list one ThinkPad as example.
> 
>> commit "e1000e: Warn if disabling ULP failed" break out of init phy:
> 
> 1.  Please add the closing quote ".
> 2.  Please add the commit hash.
> 
>> error log:
>> [   42.364753] e1000e 0000:00:1f.6 enp0s31f6: Failed to disable ULP
>> [   42.524626] e1000e 0000:00:1f.6 enp0s31f6: PHY Wakeup cause - Unicast Packet
>> [   42.822476] e1000e 0000:00:1f.6 enp0s31f6: Hardware Error
>>
>> When disable s0ix, E1000_FWSM_ULP_CFG_DONE will never be 1.
>> If continue to init phy like before, it can work as before.
>> iperf test result good too.
>>
>> Chnage e_warn to e_dbg, in case it confuses.
> 
> s/Chnage/Change/
> 
> Please leave the level warning, and improve the warning message instead, so a user knows what is going on.
> 
> Could you please add a `Fixes:` tag and the URL to the bug report?
> 
>> Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
>> ---
>>   drivers/net/ethernet/intel/e1000e/ich8lan.c | 3 +--
>>   1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.c b/drivers/net/ethernet/intel/e1000e/ich8lan.c
>> index f999cca37a8a..63405819eb83 100644
>> --- a/drivers/net/ethernet/intel/e1000e/ich8lan.c
>> +++ b/drivers/net/ethernet/intel/e1000e/ich8lan.c
>> @@ -302,8 +302,7 @@ static s32 e1000_init_phy_workarounds_pchlan(struct e1000_hw *hw)
>>       hw->dev_spec.ich8lan.ulp_state = e1000_ulp_state_unknown;
>>       ret_val = e1000_disable_ulp_lpt_lp(hw, true);
>>       if (ret_val) {
>> -        e_warn("Failed to disable ULP\n");
>> -        goto out;
>> +        e_dbg("Failed to disable ULP\n");
>>       }
>>         ret_val = hw->phy.ops.acquire(hw);
>>
> 
> Kind regards,
> 
> Paul

^ permalink raw reply

* [PATCH bpf] xdp: handle frame_sz in xdp_convert_zc_to_xdp_frame()
From: Hangbin Liu @ 2020-06-16 10:35 UTC (permalink / raw)
  To: netdev
  Cc: Jesper Dangaard Brouer, bpf, Toke Høiland-Jørgensen,
	Daniel Borkmann, Alexei Starovoitov, Hangbin Liu

Commit 34cc0b338a61 handled frame_sz only in convert_to_xdp_frame().
Handle frame_sz in xdp_convert_zc_to_xdp_frame() as well.

Fixes: 34cc0b338a61 ("xdp: Xdp_frame add member frame_sz and handle in convert_to_xdp_frame")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
 net/core/xdp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/core/xdp.c b/net/core/xdp.c
index 90f44f382115..3c45f99e26d5 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -462,6 +462,7 @@ struct xdp_frame *xdp_convert_zc_to_xdp_frame(struct xdp_buff *xdp)
 	xdpf->len = totsize - metasize;
 	xdpf->headroom = 0;
 	xdpf->metasize = metasize;
+	xdpf->frame_sz = PAGE_SIZE;
 	xdpf->mem.type = MEM_TYPE_PAGE_ORDER0;
 
 	xsk_buff_free(xdp);
-- 
2.25.4


^ permalink raw reply related

* Re: [PATCH net v2 2/2] net/sched: act_gate: fix configuration of the periodic timer
From: Vladimir Oltean @ 2020-06-16 10:38 UTC (permalink / raw)
  To: Davide Caratti; +Cc: Po Liu, Cong Wang, David S . Miller, netdev
In-Reply-To: <fd20899c60d96695060ecb782421133829f09bc2.camel@redhat.com>

Hi Davide,

On Tue, 16 Jun 2020 at 13:13, Davide Caratti <dcaratti@redhat.com> wrote:
>
> hello Vladimir,
>
> thanks a lot for reviewing this.
>
> On Tue, 2020-06-16 at 00:55 +0300, Vladimir Oltean wrote:
>
> [...]
>
> > > diff --git a/net/sched/act_gate.c b/net/sched/act_gate.c
> > > index 6775ccf355b0..3c529a4bcca5 100644
> > > --- a/net/sched/act_gate.c
> > > +++ b/net/sched/act_gate.c
> > > @@ -272,6 +272,27 @@ static int parse_gate_list(struct nlattr *list_attr,
> > >         return err;
> > >  }
> > >
> > > +static void gate_setup_timer(struct tcf_gate *gact, u64 basetime,
> > > +                            enum tk_offsets tko, s32 clockid,
> > > +                            bool do_init)
> > > +{
> > > +       if (!do_init) {
> > > +               if (basetime == gact->param.tcfg_basetime &&
> > > +                   tko == gact->tk_offset &&
> > > +                   clockid == gact->param.tcfg_clockid)
> > > +                       return;
> > > +
> > > +               spin_unlock_bh(&gact->tcf_lock);
> > > +               hrtimer_cancel(&gact->hitimer);
> > > +               spin_lock_bh(&gact->tcf_lock);
> >
> > I think it's horrible to do this just to get out of atomic context.
> > What if you split the "replace" functionality of gate_setup_timer into
> > a separate gate_cancel_timer function, which you could call earlier
> > (before taking the spin lock)?
>
> I think it would introduce the following 2 problems:
>
> problem #1) a race condition, see below:
>

I must have been living under a rock, since I missed the entire
unlocked tc filter rework done by Vlad Buslov; I thought that
tcf_action_init always runs under rtnl. So it is clear now.

> > That change would look like this:
> > diff --git a/net/sched/act_gate.c b/net/sched/act_gate.c
> > index 3c529a4bcca5..47c625a0e70c 100644
> > --- a/net/sched/act_gate.c
> > +++ b/net/sched/act_gate.c
> > @@ -273,19 +273,8 @@ static int parse_gate_list(struct nlattr *list_attr,
> >  }
> >
> >  static void gate_setup_timer(struct tcf_gate *gact, u64 basetime,
> > -                 enum tk_offsets tko, s32 clockid,
> > -                 bool do_init)
> > +                 enum tk_offsets tko, s32 clockid)
> >  {
> > -    if (!do_init) {
> > -        if (basetime == gact->param.tcfg_basetime &&
> > -            tko == gact->tk_offset &&
> > -            clockid == gact->param.tcfg_clockid)
> > -            return;
> > -
> > -        spin_unlock_bh(&gact->tcf_lock);
> > -        hrtimer_cancel(&gact->hitimer);
> > -        spin_lock_bh(&gact->tcf_lock);
> > -    }
> >      gact->param.tcfg_basetime = basetime;
> >      gact->param.tcfg_clockid = clockid;
> >      gact->tk_offset = tko;
> > @@ -293,6 +282,17 @@ static void gate_setup_timer(struct tcf_gate
> > *gact, u64 basetime,
> >      gact->hitimer.function = gate_timer_func;
> >  }
> >
> > +static void gate_cancel_timer(struct tcf_gate *gact, u64 basetime,
> > +                  enum tk_offsets tko, s32 clockid)
> > +{
> > +    if (basetime == gact->param.tcfg_basetime &&
> > +        tko == gact->tk_offset &&
> > +        clockid == gact->param.tcfg_clockid)
> > +        return;
> > +
> > +    hrtimer_cancel(&gact->hitimer);
> > +}
> > +
>
> the above function either cancels a timer, or does nothing: it depends on
> the value of the 3-ple {tcfg_basetime, tk_offset, tcfg_clockid}. If we run
> this function without holding tcf_lock, nobody will guarantee that
> {tcfg_basetime, tk_offset, tcfg_clockid} is not being concurrently
> rewritten by some other command like:
>
> # tc action replace action gate <parameters> index <x>
>
> >  static int tcf_gate_init(struct net *net, struct nlattr *nla,
> >               struct nlattr *est, struct tc_action **a,
> >               int ovr, int bind, bool rtnl_held,
> > @@ -381,6 +381,8 @@ static int tcf_gate_init(struct net *net, struct
> > nlattr *nla,
> >      gact = to_gate(*a);
> >      if (ret == ACT_P_CREATED)
> >          INIT_LIST_HEAD(&gact->param.entries);
> > +    else
> > +        gate_cancel_timer(gact, basetime, tk_offset, clockid);
> >
>
> IOW, the above line is racy unless we do spin_lock()/spin_unlock() around
> the
>
> if (<expression depending on gact-> members>)
>         return;
>
> statement before hrtimer_cancel(), which does not seem much different
> than what I did in gate_setup_timer().
>
> [...]
>
> > @@ -433,6 +448,11 @@ static int tcf_gate_init(struct net *net, struct nlattr *nla,
> > >         if (goto_ch)
> > >                 tcf_chain_put_by_act(goto_ch);
> > >  release_idr:
> > > +       /* action is not in: hitimer can be inited without taking tcf_lock */
> > > +       if (ret == ACT_P_CREATED)
> > > +               gate_setup_timer(gact, gact->param.tcfg_basetime,
> > > +                                gact->tk_offset, gact->param.tcfg_clockid,
> > > +                                true);
>
> please note, here I felt the need to add a comment, because when ret ==
> ACT_P_CREATED the action is not inserted in any list, so there is no
> concurrent writer of gact-> members for that action.
>

Then please rephrase the comment. I had read it and it still wasn't
clear at all to me what you were talking about.

> > >         tcf_idr_release(*a, bind);
> > >         return err;
> > >  }
>
> problem #2) a functional issue that originates in how 'cycle_time' and
> 'entries' are validated (*). See below:
>
> On Tue, 2020-06-16 at 00:55 +0300, Vladimir Oltean wrote:
>
> > static int tcf_gate_init(struct net *net, struct nlattr *nla,
> >               struct nlattr *est, struct tc_action **a,
> >               int ovr, int bind, bool rtnl_held,
> > @@ -381,6 +381,8 @@ static int tcf_gate_init(struct net *net, struct nlattr *nla,
> >      gact = to_gate(*a);
> >      if (ret == ACT_P_CREATED)
> >          INIT_LIST_HEAD(&gact->param.entries);
> > +    else
> > +        gate_cancel_timer(gact, basetime, tk_offset, clockid);
>
> here you propose to cancel the timer, but a few lines later we have this:
>
>         err = tcf_action_check_ctrlact(parm->action, tp, &goto_ch, extack);
>         if (err < 0)
>                 goto release_idr;
>
> so, when users try the following commands:
>
> # tc action add action gate <good parameters> index 2
> # tc action replace action gate <other good parameters> goto chain 42 index 2
>
> and chain 42 does not exist, the second command will fail. But the timer
> is erroneously stopped, and never started again. So, the first rule is
> correctly inserted, but it stops working once users try to replace it
> with another one that has an invalid control action.
>

Yes, correct.
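
To spell out the ordering problem, here is a minimal user-space C sketch.
It is not the act_gate code: struct fake_gate, check_ctrlact() and the two
replace_*() helpers are made-up stand-ins, and the boolean flag only models
whether the hrtimer is armed. The second helper assumes that every check
that can fail has been run before the timer is touched.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the gate action and its hrtimer state. */
struct fake_gate {
	bool timer_running;
	int cycle_time;
};

/* Stand-in for the control action check: fails for a missing chain. */
static int check_ctrlact(bool chain_exists)
{
	return chain_exists ? 0 : -1;
}

/* Ordering from the proposed hunk: cancel first, validate afterwards. */
static int replace_cancel_first(struct fake_gate *g, int cycle,
				bool chain_exists)
{
	assert(g);
	g->timer_running = false;	/* models gate_cancel_timer() */
	if (check_ctrlact(chain_exists) < 0)
		return -1;		/* error path: timer stays stopped */
	g->cycle_time = cycle;
	g->timer_running = true;	/* re-armed with the new parameters */
	return 0;
}

/* Alternative ordering: touch the timer only once nothing can fail. */
static int replace_validate_first(struct fake_gate *g, int cycle,
				  bool chain_exists)
{
	assert(g);
	if (check_ctrlact(chain_exists) < 0)
		return -1;		/* old action is left untouched */
	g->timer_running = false;
	g->cycle_time = cycle;
	g->timer_running = true;
	return 0;
}
```

With a nonexistent chain, replace_cancel_first() returns an error and
leaves timer_running false, whereas replace_validate_first() returns the
same error with the old timer still armed and the old cycle time intact.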

> Moving the call to gate_cancel_timer() after the validation of the control
> action will not fix this problem, because 'cycle_time' and 'entries' are
> validated together, and with the spinlock taken. Because of this, we need
> to cancel that timer only when we know that we will not do
> tcf_idr_release() and return some error to the user.
>
> please let me know if you think my doubts are not well-founded.
>
> --
> davide
>
> (*) now that I see parse_gate_list() again, I noticed another potential
> issue with replace (that I need to verify first): apparently the list is
> not replaced, it's just "updated" with new entries appended at the end. I
> will try to write a fix for that (separate from this series).
>
>

I wonder, could you call tcf_gate_cleanup instead of just canceling the hrtimer?

Thanks,
-Vladimir

* KASAN: use-after-free Read in __smsc95xx_mdio_read
From: syzbot @ 2020-06-16 10:44 UTC (permalink / raw)
  To: UNGLinuxDriver, davem, kuba, linux-kernel, linux-usb, netdev,
	steve.glendinning, syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    7ae77150 Merge tag 'powerpc-5.8-1' of git://git.kernel.org..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15f83346100000
kernel config:  https://syzkaller.appspot.com/x/.config?x=d195fe572fb15312
dashboard link: https://syzkaller.appspot.com/bug?extid=a7ebdb01bb2cc165cab6
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=17046c66100000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=140a8a3e100000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+a7ebdb01bb2cc165cab6@syzkaller.appspotmail.com

smsc95xx 1-1:1.0 eth6: Failed to read reg index 0x00000114: -19
smsc95xx 1-1:1.0 eth6 (unregistering): Error reading MII_ACCESS
smsc95xx 1-1:1.0 eth6 (unregistered): MII is busy in smsc95xx_mdio_read
==================================================================
BUG: KASAN: use-after-free in atomic64_read include/asm-generic/atomic-instrumented.h:836 [inline]
BUG: KASAN: use-after-free in atomic_long_read include/asm-generic/atomic-long.h:28 [inline]
BUG: KASAN: use-after-free in __mutex_unlock_slowpath+0x8e/0x660 kernel/locking/mutex.c:1237
Read of size 8 at addr ffff888094310c38 by task kworker/0:4/6949

CPU: 0 PID: 6949 Comm: kworker/0:4 Not tainted 5.7.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events check_carrier
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x188/0x20d lib/dump_stack.c:118
 print_address_description.constprop.0.cold+0xd3/0x413 mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
 check_memory_region_inline mm/kasan/generic.c:186 [inline]
 check_memory_region+0x141/0x190 mm/kasan/generic.c:192
 atomic64_read include/asm-generic/atomic-instrumented.h:836 [inline]
 atomic_long_read include/asm-generic/atomic-long.h:28 [inline]
 __mutex_unlock_slowpath+0x8e/0x660 kernel/locking/mutex.c:1237
 __smsc95xx_mdio_read+0x1bc/0x210 drivers/net/usb/smsc95xx.c:217
 smsc95xx_mdio_read drivers/net/usb/smsc95xx.c:278 [inline]
 check_carrier+0xf3/0x1d0 drivers/net/usb/smsc95xx.c:644
 process_one_work+0x965/0x16a0 kernel/workqueue.c:2268
 worker_thread+0x96/0xe20 kernel/workqueue.c:2414
 kthread+0x388/0x470 kernel/kthread.c:268
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:351

Allocated by task 6949:
 save_stack+0x1b/0x40 mm/kasan/common.c:48
 set_track mm/kasan/common.c:56 [inline]
 __kasan_kmalloc mm/kasan/common.c:494 [inline]
 __kasan_kmalloc.constprop.0+0xbf/0xd0 mm/kasan/common.c:467
 kmalloc_node include/linux/slab.h:578 [inline]
 kvmalloc_node+0xb4/0xf0 mm/util.c:574
 kvmalloc include/linux/mm.h:752 [inline]
 kvzalloc include/linux/mm.h:760 [inline]
 alloc_netdev_mqs+0x97/0xdc0 net/core/dev.c:9927
 usbnet_probe+0x159/0x2600 drivers/net/usb/usbnet.c:1686
 usb_probe_interface+0x305/0x7a0 drivers/usb/core/driver.c:374
 really_probe+0x281/0x6d0 drivers/base/dd.c:520
 driver_probe_device+0x104/0x210 drivers/base/dd.c:697
 __device_attach_driver+0x1c2/0x220 drivers/base/dd.c:804
 bus_for_each_drv+0x162/0x1e0 drivers/base/bus.c:431
 __device_attach+0x21a/0x360 drivers/base/dd.c:870
 bus_probe_device+0x1e4/0x290 drivers/base/bus.c:491
 device_add+0x132d/0x1c10 drivers/base/core.c:2557
 usb_set_configuration+0xec5/0x1740 drivers/usb/core/message.c:2032
 usb_generic_driver_probe+0x9d/0xe0 drivers/usb/core/generic.c:241
 usb_probe_device+0xc6/0x1f0 drivers/usb/core/driver.c:272
 really_probe+0x281/0x6d0 drivers/base/dd.c:520
 driver_probe_device+0x104/0x210 drivers/base/dd.c:697
 __device_attach_driver+0x1c2/0x220 drivers/base/dd.c:804
 bus_for_each_drv+0x162/0x1e0 drivers/base/bus.c:431
 __device_attach+0x21a/0x360 drivers/base/dd.c:870
 bus_probe_device+0x1e4/0x290 drivers/base/bus.c:491
 device_add+0x132d/0x1c10 drivers/base/core.c:2557
 usb_new_device.cold+0x753/0x103d drivers/usb/core/hub.c:2554
 hub_port_connect drivers/usb/core/hub.c:5208 [inline]
 hub_port_connect_change drivers/usb/core/hub.c:5348 [inline]
 port_event drivers/usb/core/hub.c:5494 [inline]
 hub_event+0x1eca/0x38f0 drivers/usb/core/hub.c:5576
 process_one_work+0x965/0x16a0 kernel/workqueue.c:2268
 worker_thread+0x96/0xe20 kernel/workqueue.c:2414
 kthread+0x388/0x470 kernel/kthread.c:268
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:351

Freed by task 6849:
 save_stack+0x1b/0x40 mm/kasan/common.c:48
 set_track mm/kasan/common.c:56 [inline]
 kasan_set_free_info mm/kasan/common.c:316 [inline]
 __kasan_slab_free+0xf7/0x140 mm/kasan/common.c:455
 __cache_free mm/slab.c:3426 [inline]
 kfree+0x109/0x2b0 mm/slab.c:3757
 kvfree+0x42/0x50 mm/util.c:603
 device_release+0x71/0x200 drivers/base/core.c:1394
 kobject_cleanup lib/kobject.c:693 [inline]
 kobject_release lib/kobject.c:722 [inline]
 kref_put include/linux/kref.h:65 [inline]
 kobject_put+0x1e7/0x2e0 lib/kobject.c:739
 put_device+0x1b/0x30 drivers/base/core.c:2656
 free_netdev+0x380/0x4a0 net/core/dev.c:10047
 usbnet_disconnect+0x1fb/0x270 drivers/net/usb/usbnet.c:1625
 usb_unbind_interface+0x1bd/0x8a0 drivers/usb/core/driver.c:436
 __device_release_driver drivers/base/dd.c:1110 [inline]
 device_release_driver_internal+0x432/0x500 drivers/base/dd.c:1141
 bus_remove_device+0x2dc/0x4a0 drivers/base/bus.c:533
 device_del+0x481/0xd30 drivers/base/core.c:2734
 usb_disable_device+0x211/0x690 drivers/usb/core/message.c:1245
 usb_disconnect+0x284/0x8d0 drivers/usb/core/hub.c:2217
 hub_port_connect drivers/usb/core/hub.c:5059 [inline]
 hub_port_connect_change drivers/usb/core/hub.c:5348 [inline]
 port_event drivers/usb/core/hub.c:5494 [inline]
 hub_event+0x17ca/0x38f0 drivers/usb/core/hub.c:5576
 process_one_work+0x965/0x16a0 kernel/workqueue.c:2268
 worker_thread+0x96/0xe20 kernel/workqueue.c:2414
 kthread+0x388/0x470 kernel/kthread.c:268
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:351

The buggy address belongs to the object at ffff888094310000
 which belongs to the cache kmalloc-8k of size 8192
The buggy address is located 3128 bytes inside of
 8192-byte region [ffff888094310000, ffff888094312000)
The buggy address belongs to the page:
page:ffffea000250c400 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 head:ffffea000250c400 order:2 compound_mapcount:0 compound_pincount:0
flags: 0xfffe0000010200(slab|head)
raw: 00fffe0000010200 ffffea0002257808 ffffea0002548608 ffff8880aa0021c0
raw: 0000000000000000 ffff888094310000 0000000100000001 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888094310b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888094310b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff888094310c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                        ^
 ffff888094310c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888094310d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

* [PATCH rdma-next v2 00/11] RAW format dumps through RDMAtool
From: Leon Romanovsky @ 2020-06-16 10:39 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, Jakub Kicinski, Lijun Ou, linux-rdma,
	Maor Gottlieb, netdev, Potnuri Bharat Teja, Saeed Mahameed,
	Weihang Li, Wei Hu(Xavier)

From: Leon Romanovsky <leonro@mellanox.com>

Changelog:
v2:
 * Converted to specific nldev ops for RAW.
 * Rebased on top of v5.8-rc1.
v1: https://lore.kernel.org/linux-rdma/20200527135408.480878-1-leon@kernel.org
 * Maor dropped controversial change to dummy interface.
v0: https://lore.kernel.org/linux-rdma/20200513095034.208385-1-leon@kernel.org

------------------------------------------------------------------------------

Hi,

The following series adds support for retrieving RDMA resource data in RAW
format. The main motivation is to let vendors return the entire QP/CQ/MR
data without having to set each field separately.

Thanks

Maor Gottlieb (11):
  net/mlx5: Export resource dump interface
  net/mlx5: Add support in query QP, CQ and MKEY segments
  RDMA/core: Don't call fill_res_entry for PD
  RDMA: Add dedicated MR resource tracker function
  RDMA: Add a dedicated CQ resource tracker function
  RDMA: Add dedicated QP resource tracker function
  RDMA: Add dedicated CM_ID resource tracker function
  RDMA: Add support to dump resource tracker in RAW format
  RDMA/mlx5: Add support to get QP resource in RAW format
  RDMA/mlx5: Add support to get CQ resource in RAW format
  RDMA/mlx5: Add support to get MR resource in RAW format

 drivers/infiniband/core/device.c              |  10 +-
 drivers/infiniband/core/nldev.c               | 176 +++++++++++-------
 drivers/infiniband/hw/cxgb4/iw_cxgb4.h        |   7 +-
 drivers/infiniband/hw/cxgb4/provider.c        |  11 +-
 drivers/infiniband/hw/cxgb4/restrack.c        |  24 +--
 drivers/infiniband/hw/hns/hns_roce_device.h   |   4 +-
 drivers/infiniband/hw/hns/hns_roce_main.c     |   2 +-
 drivers/infiniband/hw/hns/hns_roce_restrack.c |  14 +-
 drivers/infiniband/hw/mlx5/main.c             |   7 +-
 drivers/infiniband/hw/mlx5/mlx5_ib.h          |   9 +-
 drivers/infiniband/hw/mlx5/restrack.c         | 105 +++++++++--
 .../mellanox/mlx5/core/diag/rsc_dump.c        |   6 +
 .../mellanox/mlx5/core/diag/rsc_dump.h        |  33 +---
 .../diag => include/linux/mlx5}/rsc_dump.h    |  25 +--
 include/rdma/ib_verbs.h                       |  13 +-
 include/uapi/rdma/rdma_netlink.h              |   8 +
 16 files changed, 264 insertions(+), 190 deletions(-)
 copy {drivers/net/ethernet/mellanox/mlx5/core/diag => include/linux/mlx5}/rsc_dump.h (68%)

--
2.26.2

