Netdev List
 help / color / mirror / Atom feed
* [PATCH bpf-next 1/7] bpf: Expose check_uarg_tail_zero()
From: Martin KaFai Lau @ 2018-05-19  0:16 UTC (permalink / raw)
  To: netdev; +Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team
In-Reply-To: <20180519001650.4043980-1-kafai@fb.com>

This patch exposes check_uarg_tail_zero() which will
be reused by a later BTF patch.  Its name is changed to
bpf_check_uarg_tail_zero().

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 include/linux/bpf.h  |  2 ++
 kernel/bpf/syscall.c | 14 +++++++-------
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index ed0122b45b63..f6fe3c719ca8 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -463,6 +463,8 @@ int bpf_fd_htab_map_update_elem(struct bpf_map *map, struct file *map_file,
 int bpf_fd_htab_map_lookup_elem(struct bpf_map *map, void *key, u32 *value);
 
 int bpf_get_file_flag(int flags);
+int bpf_check_uarg_tail_zero(void __user *uaddr, size_t expected_size,
+			     size_t actual_size);
 
 /* memcpy that is used with 8-byte aligned pointers, power-of-8 size and
  * forced to use 'long' read/writes to try to atomically copy long counters.
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index bfcde949c7f8..2b29ef84ded3 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -65,9 +65,9 @@ static const struct bpf_map_ops * const bpf_map_types[] = {
  * copy_from_user() call. However, this is not a concern since this function is
  * meant to be a future-proofing of bits.
  */
-static int check_uarg_tail_zero(void __user *uaddr,
-				size_t expected_size,
-				size_t actual_size)
+int bpf_check_uarg_tail_zero(void __user *uaddr,
+			     size_t expected_size,
+			     size_t actual_size)
 {
 	unsigned char __user *addr;
 	unsigned char __user *end;
@@ -1899,7 +1899,7 @@ static int bpf_prog_get_info_by_fd(struct bpf_prog *prog,
 	u32 ulen;
 	int err;
 
-	err = check_uarg_tail_zero(uinfo, sizeof(info), info_len);
+	err = bpf_check_uarg_tail_zero(uinfo, sizeof(info), info_len);
 	if (err)
 		return err;
 	info_len = min_t(u32, sizeof(info), info_len);
@@ -1998,7 +1998,7 @@ static int bpf_map_get_info_by_fd(struct bpf_map *map,
 	u32 info_len = attr->info.info_len;
 	int err;
 
-	err = check_uarg_tail_zero(uinfo, sizeof(info), info_len);
+	err = bpf_check_uarg_tail_zero(uinfo, sizeof(info), info_len);
 	if (err)
 		return err;
 	info_len = min_t(u32, sizeof(info), info_len);
@@ -2038,7 +2038,7 @@ static int bpf_btf_get_info_by_fd(struct btf *btf,
 	u32 info_len = attr->info.info_len;
 	int err;
 
-	err = check_uarg_tail_zero(uinfo, sizeof(*uinfo), info_len);
+	err = bpf_check_uarg_tail_zero(uinfo, sizeof(*uinfo), info_len);
 	if (err)
 		return err;
 
@@ -2110,7 +2110,7 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 	if (sysctl_unprivileged_bpf_disabled && !capable(CAP_SYS_ADMIN))
 		return -EPERM;
 
-	err = check_uarg_tail_zero(uattr, sizeof(attr), size);
+	err = bpf_check_uarg_tail_zero(uattr, sizeof(attr), size);
 	if (err)
 		return err;
 	size = min_t(u32, size, sizeof(attr));
-- 
2.9.5

^ permalink raw reply related

* [PATCH bpf-next 2/7] bpf: btf: Change how section is supported in btf_header
From: Martin KaFai Lau @ 2018-05-19  0:16 UTC (permalink / raw)
  To: netdev; +Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team
In-Reply-To: <20180519001650.4043980-1-kafai@fb.com>

There are currently unused section descriptions in the btf_header.  Those
sections are here to support future BTF use cases.  For example, the
func section (func_off) is to support function signature (e.g. the BPF
prog function signature).

Instead of spelling out all potential sections up-front in the btf_header.
This patch makes changes to btf_header such that extending it (e.g. adding
a section) is possible later.  The unused ones can be removed for now and
they can be added back later.

This patch:
1. adds a hdr_len to the btf_header.  It will allow adding
sections (and other info like parent_label and parent_name)
later.  The check is similar to the existing bpf_attr.
If a user passes in a longer hdr_len, the kernel
ensures the extra tailing bytes are 0.

2. allows the section order in the BTF object to be
different from its sec_off order in btf_header.

3. each sec_off is followed by a sec_len.  It must not have gap or
overlapping among sections.

The string section is ensured to be at the end due to the 4 bytes
alignment requirement of the type section.

The above changes will allow enough flexibility to
add new sections (and other info) to the btf_header later.

This patch also removes an unnecessary !err check
at the end of btf_parse().

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 include/uapi/linux/btf.h |   8 +-
 kernel/bpf/btf.c         | 207 +++++++++++++++++++++++++++++++++++------------
 2 files changed, 158 insertions(+), 57 deletions(-)

diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h
index bcb56ee47014..4fa479741a02 100644
--- a/include/uapi/linux/btf.h
+++ b/include/uapi/linux/btf.h
@@ -12,15 +12,11 @@ struct btf_header {
 	__u16	magic;
 	__u8	version;
 	__u8	flags;
-
-	__u32	parent_label;
-	__u32	parent_name;
+	__u32	hdr_len;
 
 	/* All offsets are in bytes relative to the end of this header */
-	__u32	label_off;	/* offset of label section	*/
-	__u32	object_off;	/* offset of data object section*/
-	__u32	func_off;	/* offset of function section	*/
 	__u32	type_off;	/* offset of type section	*/
+	__u32	type_len;	/* length of type section	*/
 	__u32	str_off;	/* offset of string section	*/
 	__u32	str_len;	/* length of string section	*/
 };
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index ded10ab47b8a..536e5981ad8c 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -12,6 +12,7 @@
 #include <linux/uaccess.h>
 #include <linux/kernel.h>
 #include <linux/idr.h>
+#include <linux/sort.h>
 #include <linux/bpf_verifier.h>
 #include <linux/btf.h>
 
@@ -184,15 +185,13 @@ static DEFINE_IDR(btf_idr);
 static DEFINE_SPINLOCK(btf_idr_lock);
 
 struct btf {
-	union {
-		struct btf_header *hdr;
-		void *data;
-	};
+	void *data;
 	struct btf_type **types;
 	u32 *resolved_ids;
 	u32 *resolved_sizes;
 	const char *strings;
 	void *nohdr_data;
+	struct btf_header hdr;
 	u32 nr_types;
 	u32 types_size;
 	u32 data_size;
@@ -227,6 +226,12 @@ enum resolve_mode {
 };
 
 #define MAX_RESOLVE_DEPTH 32
+#define NR_SECS 2
+
+struct btf_sec_info {
+	u32 off;
+	u32 len;
+};
 
 struct btf_verifier_env {
 	struct btf *btf;
@@ -418,14 +423,14 @@ static const struct btf_kind_operations *btf_type_ops(const struct btf_type *t)
 static bool btf_name_offset_valid(const struct btf *btf, u32 offset)
 {
 	return !BTF_STR_TBL_ELF_ID(offset) &&
-		BTF_STR_OFFSET(offset) < btf->hdr->str_len;
+		BTF_STR_OFFSET(offset) < btf->hdr.str_len;
 }
 
 static const char *btf_name_by_offset(const struct btf *btf, u32 offset)
 {
 	if (!BTF_STR_OFFSET(offset))
 		return "(anon)";
-	else if (BTF_STR_OFFSET(offset) < btf->hdr->str_len)
+	else if (BTF_STR_OFFSET(offset) < btf->hdr.str_len)
 		return &btf->strings[BTF_STR_OFFSET(offset)];
 	else
 		return "(invalid-name-offset)";
@@ -536,7 +541,8 @@ static void btf_verifier_log_member(struct btf_verifier_env *env,
 	__btf_verifier_log(log, "\n");
 }
 
-static void btf_verifier_log_hdr(struct btf_verifier_env *env)
+static void btf_verifier_log_hdr(struct btf_verifier_env *env,
+				 u32 btf_data_size)
 {
 	struct bpf_verifier_log *log = &env->log;
 	const struct btf *btf = env->btf;
@@ -545,19 +551,16 @@ static void btf_verifier_log_hdr(struct btf_verifier_env *env)
 	if (!bpf_verifier_log_needed(log))
 		return;
 
-	hdr = btf->hdr;
+	hdr = &btf->hdr;
 	__btf_verifier_log(log, "magic: 0x%x\n", hdr->magic);
 	__btf_verifier_log(log, "version: %u\n", hdr->version);
 	__btf_verifier_log(log, "flags: 0x%x\n", hdr->flags);
-	__btf_verifier_log(log, "parent_label: %u\n", hdr->parent_label);
-	__btf_verifier_log(log, "parent_name: %u\n", hdr->parent_name);
-	__btf_verifier_log(log, "label_off: %u\n", hdr->label_off);
-	__btf_verifier_log(log, "object_off: %u\n", hdr->object_off);
-	__btf_verifier_log(log, "func_off: %u\n", hdr->func_off);
+	__btf_verifier_log(log, "hdr_len: %u\n", hdr->hdr_len);
 	__btf_verifier_log(log, "type_off: %u\n", hdr->type_off);
+	__btf_verifier_log(log, "type_len: %u\n", hdr->type_len);
 	__btf_verifier_log(log, "str_off: %u\n", hdr->str_off);
 	__btf_verifier_log(log, "str_len: %u\n", hdr->str_len);
-	__btf_verifier_log(log, "btf_total_size: %u\n", btf->data_size);
+	__btf_verifier_log(log, "btf_total_size: %u\n", btf_data_size);
 }
 
 static int btf_add_type(struct btf_verifier_env *env, struct btf_type *t)
@@ -1754,9 +1757,9 @@ static int btf_check_all_metas(struct btf_verifier_env *env)
 	struct btf_header *hdr;
 	void *cur, *end;
 
-	hdr = btf->hdr;
+	hdr = &btf->hdr;
 	cur = btf->nohdr_data + hdr->type_off;
-	end = btf->nohdr_data + hdr->str_off;
+	end = btf->nohdr_data + hdr->type_len;
 
 	env->log_type_id = 1;
 	while (cur < end) {
@@ -1866,8 +1869,20 @@ static int btf_check_all_types(struct btf_verifier_env *env)
 
 static int btf_parse_type_sec(struct btf_verifier_env *env)
 {
+	const struct btf_header *hdr = &env->btf->hdr;
 	int err;
 
+	/* Type section must align to 4 bytes */
+	if (hdr->type_off & (sizeof(u32) - 1)) {
+		btf_verifier_log(env, "Unaligned type_off");
+		return -EINVAL;
+	}
+
+	if (!hdr->type_len) {
+		btf_verifier_log(env, "No type found");
+		return -EINVAL;
+	}
+
 	err = btf_check_all_metas(env);
 	if (err)
 		return err;
@@ -1881,10 +1896,15 @@ static int btf_parse_str_sec(struct btf_verifier_env *env)
 	struct btf *btf = env->btf;
 	const char *start, *end;
 
-	hdr = btf->hdr;
+	hdr = &btf->hdr;
 	start = btf->nohdr_data + hdr->str_off;
 	end = start + hdr->str_len;
 
+	if (end != btf->data + btf->data_size) {
+		btf_verifier_log(env, "String section is not at the end");
+		return -EINVAL;
+	}
+
 	if (!hdr->str_len || hdr->str_len - 1 > BTF_MAX_NAME_OFFSET ||
 	    start[0] || end[-1]) {
 		btf_verifier_log(env, "Invalid string section");
@@ -1896,20 +1916,119 @@ static int btf_parse_str_sec(struct btf_verifier_env *env)
 	return 0;
 }
 
-static int btf_parse_hdr(struct btf_verifier_env *env)
+static const size_t btf_sec_info_offset[] = {
+	offsetof(struct btf_header, type_off),
+	offsetof(struct btf_header, str_off),
+};
+
+static int btf_sec_info_cmp(const void *a, const void *b)
 {
+	const struct btf_sec_info *x = a;
+	const struct btf_sec_info *y = b;
+
+	return (int)(x->off - y->off) ? : (int)(x->len - y->len);
+}
+
+static int btf_check_sec_info(struct btf_verifier_env *env,
+			      u32 btf_data_size)
+{
+	struct btf_sec_info secs[NR_SECS];
+	u32 total, expected_total, i;
 	const struct btf_header *hdr;
-	struct btf *btf = env->btf;
-	u32 meta_left;
+	const struct btf *btf;
+
+	BUILD_BUG_ON(ARRAY_SIZE(btf_sec_info_offset) != NR_SECS);
+
+	btf = env->btf;
+	hdr = &btf->hdr;
+
+	/* Populate the secs from hdr */
+	for (i = 0; i < NR_SECS; i++)
+		secs[i] = *(struct btf_sec_info *)((void *)hdr +
+						   btf_sec_info_offset[i]);
+
+	sort(secs, NR_SECS, sizeof(struct btf_sec_info),
+	     btf_sec_info_cmp, NULL);
+
+	/* Check for gaps and overlap among sections */
+	total = 0;
+	expected_total = btf_data_size - hdr->hdr_len;
+	for (i = 0; i < NR_SECS; i++) {
+		if (expected_total < secs[i].off) {
+			btf_verifier_log(env, "Invalid section offset");
+			return -EINVAL;
+		}
+		if (total < secs[i].off) {
+			/* gap */
+			btf_verifier_log(env, "Unsupported section found");
+			return -EINVAL;
+		}
+		if (total > secs[i].off) {
+			btf_verifier_log(env, "Section overlap found");
+			return -EINVAL;
+		}
+		if (expected_total - total < secs[i].len) {
+			btf_verifier_log(env,
+					 "Total section length too long");
+			return -EINVAL;
+		}
+		total += secs[i].len;
+	}
+
+	/* There is data other than hdr and known sections */
+	if (expected_total != total) {
+		btf_verifier_log(env, "Unsupported section found");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int btf_parse_hdr(struct btf_verifier_env *env, void __user *btf_data,
+			 u32 btf_data_size)
+{
+	const struct btf_header *hdr;
+	u32 hdr_len, hdr_copy;
+	struct btf_min_header {
+		u16	magic;
+		u8	version;
+		u8	flags;
+		u32	hdr_len;
+	} __user *min_hdr;
+	struct btf *btf;
+	int err;
+
+	btf = env->btf;
+	min_hdr = btf_data;
+
+	if (btf_data_size < sizeof(*min_hdr)) {
+		btf_verifier_log(env, "hdr_len not found");
+		return -EINVAL;
+	}
 
-	if (btf->data_size < sizeof(*hdr)) {
+	if (get_user(hdr_len, &min_hdr->hdr_len))
+		return -EFAULT;
+
+	if (btf_data_size < hdr_len) {
 		btf_verifier_log(env, "btf_header not found");
 		return -EINVAL;
 	}
 
-	btf_verifier_log_hdr(env);
+	err = bpf_check_uarg_tail_zero(btf_data, sizeof(btf->hdr), hdr_len);
+	if (err) {
+		if (err == -E2BIG)
+			btf_verifier_log(env, "Unsupported btf_header");
+		return err;
+	}
+
+	hdr_copy = min_t(u32, hdr_len, sizeof(btf->hdr));
+	if (copy_from_user(&btf->hdr, btf_data, hdr_copy))
+		return -EFAULT;
+
+	hdr = &btf->hdr;
+
+	btf_verifier_log_hdr(env, btf_data_size);
 
-	hdr = btf->hdr;
 	if (hdr->magic != BTF_MAGIC) {
 		btf_verifier_log(env, "Invalid magic");
 		return -EINVAL;
@@ -1925,26 +2044,14 @@ static int btf_parse_hdr(struct btf_verifier_env *env)
 		return -ENOTSUPP;
 	}
 
-	meta_left = btf->data_size - sizeof(*hdr);
-	if (!meta_left) {
+	if (btf_data_size == hdr->hdr_len) {
 		btf_verifier_log(env, "No data");
 		return -EINVAL;
 	}
 
-	if (meta_left < hdr->type_off || hdr->str_off <= hdr->type_off ||
-	    /* Type section must align to 4 bytes */
-	    hdr->type_off & (sizeof(u32) - 1)) {
-		btf_verifier_log(env, "Invalid type_off");
-		return -EINVAL;
-	}
-
-	if (meta_left < hdr->str_off ||
-	    meta_left - hdr->str_off < hdr->str_len) {
-		btf_verifier_log(env, "Invalid str_off or str_len");
-		return -EINVAL;
-	}
-
-	btf->nohdr_data = btf->hdr + 1;
+	err = btf_check_sec_info(env, btf_data_size);
+	if (err)
+		return err;
 
 	return 0;
 }
@@ -1987,6 +2094,11 @@ static struct btf *btf_parse(void __user *btf_data, u32 btf_data_size,
 		err = -ENOMEM;
 		goto errout;
 	}
+	env->btf = btf;
+
+	err = btf_parse_hdr(env, btf_data, btf_data_size);
+	if (err)
+		goto errout;
 
 	data = kvmalloc(btf_data_size, GFP_KERNEL | __GFP_NOWARN);
 	if (!data) {
@@ -1996,18 +2108,13 @@ static struct btf *btf_parse(void __user *btf_data, u32 btf_data_size,
 
 	btf->data = data;
 	btf->data_size = btf_data_size;
+	btf->nohdr_data = btf->data + btf->hdr.hdr_len;
 
 	if (copy_from_user(data, btf_data, btf_data_size)) {
 		err = -EFAULT;
 		goto errout;
 	}
 
-	env->btf = btf;
-
-	err = btf_parse_hdr(env);
-	if (err)
-		goto errout;
-
 	err = btf_parse_str_sec(env);
 	if (err)
 		goto errout;
@@ -2016,16 +2123,14 @@ static struct btf *btf_parse(void __user *btf_data, u32 btf_data_size,
 	if (err)
 		goto errout;
 
-	if (!err && log->level && bpf_verifier_log_full(log)) {
+	if (log->level && bpf_verifier_log_full(log)) {
 		err = -ENOSPC;
 		goto errout;
 	}
 
-	if (!err) {
-		btf_verifier_env_free(env);
-		refcount_set(&btf->refcnt, 1);
-		return btf;
-	}
+	btf_verifier_env_free(env);
+	refcount_set(&btf->refcnt, 1);
+	return btf;
 
 errout:
 	btf_verifier_env_free(env);
-- 
2.9.5

^ permalink raw reply related

* [PATCH bpf-next 5/7] bpf: btf: Rename btf_key_id and btf_value_id in bpf_map_info
From: Martin KaFai Lau @ 2018-05-19  0:16 UTC (permalink / raw)
  To: netdev; +Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team
In-Reply-To: <20180519001650.4043980-1-kafai@fb.com>

In "struct bpf_map_info", the name "btf_id", "btf_key_id" and "btf_value_id"
could cause confusion because the "id" of "btf_id" means the BPF obj id
given to the BTF object while
"btf_key_id" and "btf_value_id" means the BTF type id within
that BTF object.

To make it clear, btf_key_id and btf_value_id are
renamed to btf_key_type_id and btf_value_type_id.

Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 include/linux/bpf.h      |  4 ++--
 include/uapi/linux/bpf.h |  8 ++++----
 kernel/bpf/arraymap.c    |  2 +-
 kernel/bpf/syscall.c     | 18 +++++++++---------
 4 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index f6fe3c719ca8..1795eeee846c 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -69,8 +69,8 @@ struct bpf_map {
 	u32 pages;
 	u32 id;
 	int numa_node;
-	u32 btf_key_id;
-	u32 btf_value_id;
+	u32 btf_key_type_id;
+	u32 btf_value_type_id;
 	struct btf *btf;
 	bool unpriv_array;
 	/* 55 bytes hole */
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index d94d333a8225..123ebe4b3662 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -284,8 +284,8 @@ union bpf_attr {
 		char	map_name[BPF_OBJ_NAME_LEN];
 		__u32	map_ifindex;	/* ifindex of netdev to create on */
 		__u32	btf_fd;		/* fd pointing to a BTF type data */
-		__u32	btf_key_id;	/* BTF type_id of the key */
-		__u32	btf_value_id;	/* BTF type_id of the value */
+		__u32	btf_key_type_id;	/* BTF type_id of the key */
+		__u32	btf_value_type_id;	/* BTF type_id of the value */
 	};
 
 	struct { /* anonymous struct used by BPF_MAP_*_ELEM commands */
@@ -2211,8 +2211,8 @@ struct bpf_map_info {
 	__u64 netns_dev;
 	__u64 netns_ino;
 	__u32 btf_id;
-	__u32 btf_key_id;
-	__u32 btf_value_id;
+	__u32 btf_key_type_id;
+	__u32 btf_value_type_id;
 } __attribute__((aligned(8)));
 
 struct bpf_btf_info {
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 0fd8d8f1a398..544e58f5f642 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -352,7 +352,7 @@ static void array_map_seq_show_elem(struct bpf_map *map, void *key,
 	}
 
 	seq_printf(m, "%u: ", *(u32 *)key);
-	btf_type_seq_show(map->btf, map->btf_value_id, value, m);
+	btf_type_seq_show(map->btf, map->btf_value_type_id, value, m);
 	seq_puts(m, "\n");
 
 	rcu_read_unlock();
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 2b29ef84ded3..0b4c94551001 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -422,7 +422,7 @@ static int bpf_obj_name_cpy(char *dst, const char *src)
 	return 0;
 }
 
-#define BPF_MAP_CREATE_LAST_FIELD btf_value_id
+#define BPF_MAP_CREATE_LAST_FIELD btf_value_type_id
 /* called via syscall */
 static int map_create(union bpf_attr *attr)
 {
@@ -457,10 +457,10 @@ static int map_create(union bpf_attr *attr)
 	atomic_set(&map->usercnt, 1);
 
 	if (bpf_map_support_seq_show(map) &&
-	    (attr->btf_key_id || attr->btf_value_id)) {
+	    (attr->btf_key_type_id || attr->btf_value_type_id)) {
 		struct btf *btf;
 
-		if (!attr->btf_key_id || !attr->btf_value_id) {
+		if (!attr->btf_key_type_id || !attr->btf_value_type_id) {
 			err = -EINVAL;
 			goto free_map_nouncharge;
 		}
@@ -471,16 +471,16 @@ static int map_create(union bpf_attr *attr)
 			goto free_map_nouncharge;
 		}
 
-		err = map->ops->map_check_btf(map, btf, attr->btf_key_id,
-					      attr->btf_value_id);
+		err = map->ops->map_check_btf(map, btf, attr->btf_key_type_id,
+					      attr->btf_value_type_id);
 		if (err) {
 			btf_put(btf);
 			goto free_map_nouncharge;
 		}
 
 		map->btf = btf;
-		map->btf_key_id = attr->btf_key_id;
-		map->btf_value_id = attr->btf_value_id;
+		map->btf_key_type_id = attr->btf_key_type_id;
+		map->btf_value_type_id = attr->btf_value_type_id;
 	}
 
 	err = security_bpf_map_alloc(map);
@@ -2013,8 +2013,8 @@ static int bpf_map_get_info_by_fd(struct bpf_map *map,
 
 	if (map->btf) {
 		info.btf_id = btf_id(map->btf);
-		info.btf_key_id = map->btf_key_id;
-		info.btf_value_id = map->btf_value_id;
+		info.btf_key_type_id = map->btf_key_type_id;
+		info.btf_value_type_id = map->btf_value_type_id;
 	}
 
 	if (bpf_map_is_dev_bound(map)) {
-- 
2.9.5

^ permalink raw reply related

* [PATCH bpf-next 4/7] bpf: btf: Remove unused bits from uapi/linux/btf.h
From: Martin KaFai Lau @ 2018-05-19  0:16 UTC (permalink / raw)
  To: netdev; +Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team
In-Reply-To: <20180519001650.4043980-1-kafai@fb.com>

This patch does the followings:
1. Limit BTF_MAX_TYPES and BTF_MAX_NAME_OFFSET to 64k.  We can
   raise it later.

2. Remove the BTF_TYPE_PARENT and BTF_STR_TBL_ELF_ID.  They are
   currently encoded at the highest bit of a u32.
   It is because the current use case does not require supporting
   parent type (i.e type_id referring to a type in another BTF file).
   It also does not support referring to a string in ELF.

   The BTF_TYPE_PARENT and BTF_STR_TBL_ELF_ID checks are replaced
   by BTF_TYPE_ID_CHECK and BTF_STR_OFFSET_CHECK which are
   defined in btf.c instead of uapi/linux/btf.h.

3. Limit the BTF_INFO_KIND from 5 bits to 4 bits which is enough.
   There is unused bits headroom if we ever needed it later.

4. The root bit in BTF_INFO is also removed because it is not
   used in the current use case.

The above can be added back later because the verifier
ensures the unused bits are zeros.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 include/uapi/linux/btf.h | 20 +++++---------------
 kernel/bpf/btf.c         | 34 +++++++++++++++++++++-------------
 2 files changed, 26 insertions(+), 28 deletions(-)

diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h
index 4fa479741a02..b89b56f2b099 100644
--- a/include/uapi/linux/btf.h
+++ b/include/uapi/linux/btf.h
@@ -22,28 +22,19 @@ struct btf_header {
 };
 
 /* Max # of type identifier */
-#define BTF_MAX_TYPE	0x7fffffff
+#define BTF_MAX_TYPE	0x0000ffff
 /* Max offset into the string section */
-#define BTF_MAX_NAME_OFFSET	0x7fffffff
+#define BTF_MAX_NAME_OFFSET	0x0000ffff
 /* Max # of struct/union/enum members or func args */
 #define BTF_MAX_VLEN	0xffff
 
-/* The type id is referring to a parent BTF */
-#define BTF_TYPE_PARENT(id)	(((id) >> 31) & 0x1)
-#define BTF_TYPE_ID(id)		((id) & BTF_MAX_TYPE)
-
-/* String is in the ELF string section */
-#define BTF_STR_TBL_ELF_ID(ref)	(((ref) >> 31) & 0x1)
-#define BTF_STR_OFFSET(ref)	((ref) & BTF_MAX_NAME_OFFSET)
-
 struct btf_type {
 	__u32 name_off;
 	/* "info" bits arrangement
 	 * bits  0-15: vlen (e.g. # of struct's members)
 	 * bits 16-23: unused
-	 * bits 24-28: kind (e.g. int, ptr, array...etc)
-	 * bits 29-30: unused
-	 * bits    31: root
+	 * bits 24-27: kind (e.g. int, ptr, array...etc)
+	 * bits 28-31: unused
 	 */
 	__u32 info;
 	/* "size" is used by INT, ENUM, STRUCT and UNION.
@@ -58,8 +49,7 @@ struct btf_type {
 	};
 };
 
-#define BTF_INFO_KIND(info)	(((info) >> 24) & 0x1f)
-#define BTF_INFO_ISROOT(info)	(!!(((info) >> 24) & 0x80))
+#define BTF_INFO_KIND(info)	(((info) >> 24) & 0x0f)
 #define BTF_INFO_VLEN(info)	((info) & 0xffff)
 
 #define BTF_KIND_UNKN		0	/* Unknown	*/
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index b4e48dae2240..5d1967d4fb62 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -163,13 +163,15 @@
 #define BITS_ROUNDUP_BYTES(bits) \
 	(BITS_ROUNDDOWN_BYTES(bits) + !!BITS_PER_BYTE_MASKED(bits))
 
+#define BTF_INFO_MASK 0x0f00ffff
+#define BTF_TYPE_ID_CHECK(type_id) ((type_id) <= BTF_MAX_TYPE)
+#define BTF_STR_OFFSET_CHECK(name_off) ((name_off) <= BTF_MAX_NAME_OFFSET)
+
 /* 16MB for 64k structs and each has 16 members and
  * a few MB spaces for the string section.
  * The hard limit is S32_MAX.
  */
 #define BTF_MAX_SIZE (16 * 1024 * 1024)
-/* 64k. We can raise it later. The hard limit is S32_MAX. */
-#define BTF_MAX_NR_TYPES 65535
 
 #define for_each_member(i, struct_type, member)			\
 	for (i = 0, member = btf_type_member(struct_type);	\
@@ -422,16 +424,16 @@ static const struct btf_kind_operations *btf_type_ops(const struct btf_type *t)
 
 static bool btf_name_offset_valid(const struct btf *btf, u32 offset)
 {
-	return !BTF_STR_TBL_ELF_ID(offset) &&
-		BTF_STR_OFFSET(offset) < btf->hdr.str_len;
+	return BTF_STR_OFFSET_CHECK(offset) &&
+		offset < btf->hdr.str_len;
 }
 
 static const char *btf_name_by_offset(const struct btf *btf, u32 offset)
 {
-	if (!BTF_STR_OFFSET(offset))
+	if (!offset)
 		return "(anon)";
-	else if (BTF_STR_OFFSET(offset) < btf->hdr.str_len)
-		return &btf->strings[BTF_STR_OFFSET(offset)];
+	else if (offset < btf->hdr.str_len)
+		return &btf->strings[offset];
 	else
 		return "(invalid-name-offset)";
 }
@@ -599,13 +601,13 @@ static int btf_add_type(struct btf_verifier_env *env, struct btf_type *t)
 		struct btf_type **new_types;
 		u32 expand_by, new_size;
 
-		if (btf->types_size == BTF_MAX_NR_TYPES) {
+		if (btf->types_size == BTF_MAX_TYPE) {
 			btf_verifier_log(env, "Exceeded max num of types");
 			return -E2BIG;
 		}
 
 		expand_by = max_t(u32, btf->types_size >> 2, 16);
-		new_size = min_t(u32, BTF_MAX_NR_TYPES,
+		new_size = min_t(u32, BTF_MAX_TYPE,
 				 btf->types_size + expand_by);
 
 		new_types = kvzalloc(new_size * sizeof(*new_types),
@@ -1127,7 +1129,7 @@ static int btf_ref_type_check_meta(struct btf_verifier_env *env,
 		return -EINVAL;
 	}
 
-	if (BTF_TYPE_PARENT(t->type)) {
+	if (!BTF_TYPE_ID_CHECK(t->type)) {
 		btf_verifier_log_type(env, t, "Invalid type_id");
 		return -EINVAL;
 	}
@@ -1334,12 +1336,12 @@ static s32 btf_array_check_meta(struct btf_verifier_env *env,
 	/* Array elem type and index type cannot be in type void,
 	 * so !array->type and !array->index_type are not allowed.
 	 */
-	if (!array->type || BTF_TYPE_PARENT(array->type)) {
+	if (!array->type || !BTF_TYPE_ID_CHECK(array->type)) {
 		btf_verifier_log_type(env, t, "Invalid elem");
 		return -EINVAL;
 	}
 
-	if (!array->index_type || BTF_TYPE_PARENT(array->index_type)) {
+	if (!array->index_type || !BTF_TYPE_ID_CHECK(array->index_type)) {
 		btf_verifier_log_type(env, t, "Invalid index");
 		return -EINVAL;
 	}
@@ -1511,7 +1513,7 @@ static s32 btf_struct_check_meta(struct btf_verifier_env *env,
 		}
 
 		/* A member cannot be in type void */
-		if (!member->type || BTF_TYPE_PARENT(member->type)) {
+		if (!member->type || !BTF_TYPE_ID_CHECK(member->type)) {
 			btf_verifier_log_member(env, t, member,
 						"Invalid type_id");
 			return -EINVAL;
@@ -1764,6 +1766,12 @@ static s32 btf_check_meta(struct btf_verifier_env *env,
 	}
 	meta_left -= sizeof(*t);
 
+	if (t->info & ~BTF_INFO_MASK) {
+		btf_verifier_log(env, "[%u] Invalid btf_info:%x",
+				 env->log_type_id, t->info);
+		return -EINVAL;
+	}
+
 	if (BTF_INFO_KIND(t->info) > BTF_KIND_MAX ||
 	    BTF_INFO_KIND(t->info) == BTF_KIND_UNKN) {
 		btf_verifier_log(env, "[%u] Invalid kind:%u",
-- 
2.9.5

^ permalink raw reply related

* Re: [PATCH iproute2] ip link: Do not call ll_name_to_index when creating a new link
From: David Ahern @ 2018-05-18 23:40 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20180518150816.70901564@xeon-e3>

On 5/18/18 4:08 PM, Stephen Hemminger wrote:
> 
> What about just pushing the lookup down to the leaf functions that need it?
> 

That should work as well. You want to re-send a formal patch?

^ permalink raw reply

* general protection fault in smc_ioctl
From: syzbot @ 2018-05-18 23:25 UTC (permalink / raw)
  To: davem, linux-kernel, linux-s390, netdev, syzkaller-bugs, ubraun

Hello,

syzbot found the following crash on:

HEAD commit:    1f7455c3912d tcp: tcp_rack_reo_wnd() can be static
git tree:       net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=171a1337800000
kernel config:  https://syzkaller.appspot.com/x/.config?x=b632d8e2c2ab2c1
dashboard link: https://syzkaller.appspot.com/bug?extid=e6714328fda813fc670f
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=15782d57800000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=108711a7800000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+e6714328fda813fc670f@syzkaller.appspotmail.com

random: sshd: uninitialized urandom read (32 bytes read)
random: sshd: uninitialized urandom read (32 bytes read)
random: sshd: uninitialized urandom read (32 bytes read)
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
    (ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 4559 Comm: syz-executor292 Not tainted 4.17.0-rc4+ #50
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
RIP: 0010:smc_ioctl+0x3dc/0x9f0 net/smc/af_smc.c:1499
RSP: 0018:ffff8801ad22f770 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff8801ad0df7c0 RCX: ffffffff8741188f
RDX: 0000000000000004 RSI: ffffffff8741189e RDI: 0000000000000020
RBP: ffff8801ad22f9d0 R08: ffff8801ae87e6c0 R09: ffffed00363e1818
R10: ffffed00363e1818 R11: ffff8801b1f0c0c3 R12: 1ffff10035a45ef1
R13: 0000000020000080 R14: 0000000000000000 R15: 0000000000000000
FS:  00000000017b7880(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffd1f18f038 CR3: 00000001ad044000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
  sock_do_ioctl+0xe4/0x3e0 net/socket.c:957
  sock_ioctl+0x30d/0x680 net/socket.c:1081
  vfs_ioctl fs/ioctl.c:46 [inline]
  file_ioctl fs/ioctl.c:500 [inline]
  do_vfs_ioctl+0x1cf/0x16a0 fs/ioctl.c:684
  ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701
  __do_sys_ioctl fs/ioctl.c:708 [inline]
  __se_sys_ioctl fs/ioctl.c:706 [inline]
  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:706
  do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x43fca9
RSP: 002b:00007ffd1f073588 EFLAGS: 00000213 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043fca9
RDX: 0000000020000080 RSI: 0000000000005411 RDI: 0000000000000003
RBP: 00000000006ca018 R08: 00000000004002c8 R09: 00000000004002c8
R10: 00000000004002c8 R11: 0000000000000213 R12: 00000000004015d0
R13: 0000000000401660 R14: 0000000000000000 R15: 0000000000000000
Code: fa 48 c1 ea 03 80 3c 02 00 0f 85 7d 05 00 00 4c 8b b3 90 04 00 00 48  
b8 00 00 00 00 00 fc ff df 49 8d 7e 20 48 89 fa 48 c1 ea 03 <0f> b6 04 02  
84 c0 74 08 3c 03 0f 8e 47 05 00 00 45 8b 7e 20 4c
RIP: smc_ioctl+0x3dc/0x9f0 net/smc/af_smc.c:1499 RSP: ffff8801ad22f770
---[ end trace b586e1eb098f7714 ]---


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with  
syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply

* Re: [PATCH iproute2] Allow to configure /var/run/netns directory
From: Pavel Maltsev @ 2018-05-18 22:48 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, Lorenzo Colitti
In-Reply-To: <20180518145306.1e7632b0@xeon-e3>

Thanks, Stephen,

I've uploaded new patch as you suggested by putting these
variables in the makefile rather than configure script.

On Fri, May 18, 2018 at 2:53 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Tue, 15 May 2018 14:49:46 -0700
> Pavel Maltsev <pavelm@google.com> wrote:
>
> > Currently NETNS_RUN_DIR is hardcoded and refers to /var/run/netns.
> > However, some systems (e.g. Android) doesn't have /var
> > which results in error attempts to create network namespaces on these
> > systems.  This change makes NETNS_RUN_DIR configurable at build time
> > by allowing to pass environment variable to configre script.
> >
> > For example: NETNS_RUN_DIR=/mnt/vendor/netns ./configure && make
> >
> > Tested: verified that iproute2 with configuration mentioned above
> > creates namespaces in /mnt/vendor/netns
> >
> > Signed-off-by: Pavel Maltsev <pavelm@google.com>
>
> The directory path should definitely be overrideable on the build.
> The configure script is already messy enough, lets do it instead like
> the other runtime directories are already done ARPDDIR and CONFDIR.
>
> Something like?
>
> diff --git a/Makefile b/Makefile
> index b526d3b5b5c4..ab828669e711 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -16,6 +16,7 @@ PREFIX?=/usr
>  LIBDIR?=$(PREFIX)/lib
>  SBINDIR?=/sbin
>  CONFDIR?=/etc/iproute2
> +NETNS_RUN_DIR?=/var/run/netns
>  DATADIR?=$(PREFIX)/share
>  HDRDIR?=$(PREFIX)/include/iproute2
>  DOCDIR?=$(DATADIR)/doc/iproute2
> @@ -34,7 +35,7 @@ ifneq ($(SHARED_LIBS),y)
>  DEFINES+= -DNO_SHARED_LIBS
>  endif
>
> -DEFINES+=-DCONFDIR=\"$(CONFDIR)\"
> +DEFINES+=-DCONFDIR=\"$(CONFDIR)\" -DNETNS_RUN_DIR=\"$(NETNS_RUN_DIR)\"
>
>  #options for decnet
>  ADDLIB+=dnet_ntop.o dnet_pton.o
> diff --git a/include/namespace.h b/include/namespace.h
> index aed7ce08507f..e47f9b5d49d1 100644
> --- a/include/namespace.h
> +++ b/include/namespace.h
> @@ -8,8 +8,13 @@
>  #include <sys/syscall.h>
>  #include <errno.h>
>
> +#ifndef NETNS_RUN_DIR
>  #define NETNS_RUN_DIR "/var/run/netns"
> +#endif
> +
> +#ifndef NETNS_ETC_DIR
>  #define NETNS_ETC_DIR "/etc/netns"
> +#endif
>
>  #ifndef CLONE_NEWNET
>  #define CLONE_NEWNET 0x40000000        /* New network namespace (lo, device, names sockets, etc) */
>

^ permalink raw reply

* [PATCH iproute2] Allow to configure /var/run/netns directory
From: Pavel Maltsev @ 2018-05-18 22:44 UTC (permalink / raw)
  To: netdev, lorenzo, stephen, pavelm; +Cc: lorenzo, stephen, Pavel Maltsev

Currently NETNS_RUN_DIR is hardcoded and refers to /var/run/netns.
However, some systems (e.g. Android) doesn't have /var
which results in error attempts to create network namespaces on these
systems.  This change makes NETNS_RUN_DIR configurable at build time
by allowing to pass environment variable to make command.
Also, this change makes /etc/netns directory configurable through
NETNS_ETC_DIR environment variable.

For example: ./configure && NETNS_RUN_DIR=/mnt/vendor/netns make

Tested: verified that iproute2 with configuration mentioned above
creates namespaces in /mnt/vendor/netns

Signed-off-by: Pavel Maltsev <pavelm@google.com>
---
 Makefile            | 6 +++++-
 include/namespace.h | 5 +++++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index b526d3b5..651d2a50 100644
--- a/Makefile
+++ b/Makefile
@@ -16,6 +16,8 @@ PREFIX?=/usr
 LIBDIR?=$(PREFIX)/lib
 SBINDIR?=/sbin
 CONFDIR?=/etc/iproute2
+NETNS_RUN_DIR?=/var/run/netns
+NETNS_ETC_DIR?=/etc/netns
 DATADIR?=$(PREFIX)/share
 HDRDIR?=$(PREFIX)/include/iproute2
 DOCDIR?=$(DATADIR)/doc/iproute2
@@ -34,7 +36,9 @@ ifneq ($(SHARED_LIBS),y)
 DEFINES+= -DNO_SHARED_LIBS
 endif
 
-DEFINES+=-DCONFDIR=\"$(CONFDIR)\"
+DEFINES+=-DCONFDIR=\"$(CONFDIR)\" \
+         -DNETNS_RUN_DIR=\"$(NETNS_RUN_DIR)\" \
+         -DNETNS_ETC_DIR=\"$(NETNS_ETC_DIR)\"
 
 #options for decnet
 ADDLIB+=dnet_ntop.o dnet_pton.o
diff --git a/include/namespace.h b/include/namespace.h
index aed7ce08..e47f9b5d 100644
--- a/include/namespace.h
+++ b/include/namespace.h
@@ -8,8 +8,13 @@
 #include <sys/syscall.h>
 #include <errno.h>
 
+#ifndef NETNS_RUN_DIR
 #define NETNS_RUN_DIR "/var/run/netns"
+#endif
+
+#ifndef NETNS_ETC_DIR
 #define NETNS_ETC_DIR "/etc/netns"
+#endif
 
 #ifndef CLONE_NEWNET
 #define CLONE_NEWNET 0x40000000	/* New network namespace (lo, device, names sockets, etc) */
-- 
2.17.0.441.gb46fe60e1d-goog

^ permalink raw reply related

* Re: [PATCH net-next v2 1/3] net: ethernet: ti: Allow most drivers with COMPILE_TEST
From: kbuild test robot @ 2018-05-18 22:32 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: kbuild-all, netdev, fugang.duan, Florian Fainelli,
	David S. Miller, Andrew Lunn, open list
In-Reply-To: <20180516185258.20508-2-f.fainelli@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1500 bytes --]

Hi Florian,

I love your patch! Yet something to improve:

[auto build test ERROR on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Florian-Fainelli/net-ethernet-ti-Allow-most-drivers-with-COMPILE_TEST/20180519-043005
config: sparc64-allyesconfig (attached as .config)
compiler: sparc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=sparc64 

All errors (new ones prefixed by >>):

   `.exit.data' referenced in section `.exit.text' of drivers/tty/n_hdlc.o: defined in discarded section `.exit.data' of drivers/tty/n_hdlc.o
   `.exit.data' referenced in section `.exit.text' of drivers/tty/n_hdlc.o: defined in discarded section `.exit.data' of drivers/tty/n_hdlc.o
   `.exit.data' referenced in section `.exit.text' of drivers/tty/n_hdlc.o: defined in discarded section `.exit.data' of drivers/tty/n_hdlc.o
   `.exit.data' referenced in section `.exit.text' of drivers/tty/n_hdlc.o: defined in discarded section `.exit.data' of drivers/tty/n_hdlc.o
   drivers/net/ethernet/ti/netcp_core.o: In function `netcp_txpipe_open':
>> netcp_core.c:(.text+0xc84): undefined reference to `knav_queue_open'

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 53402 bytes --]

^ permalink raw reply

* [PATCH v2] selftests: net: reuseport_bpf_numa: don't fail if no numa support
From: Anders Roxell @ 2018-05-18 22:27 UTC (permalink / raw)
  To: davem, shuah; +Cc: netdev, linux-kselftest, linux-kernel, Anders Roxell
In-Reply-To: <20180306151004.31336-1-anders.roxell@linaro.org>

The reuseport_bpf_numa test case fails there's no numa support.  The
test shouldn't fail if there's no support it should be skipped.

Fixes: 3c2c3c16aaf6 ("reuseport, bpf: add test case for bpf_get_numa_node_id")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
---
 tools/testing/selftests/net/reuseport_bpf_numa.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/net/reuseport_bpf_numa.c b/tools/testing/selftests/net/reuseport_bpf_numa.c
index 365c32e84189..c9f478b40996 100644
--- a/tools/testing/selftests/net/reuseport_bpf_numa.c
+++ b/tools/testing/selftests/net/reuseport_bpf_numa.c
@@ -23,6 +23,8 @@
 #include <unistd.h>
 #include <numa.h>
 
+#include "../kselftest.h"
+
 static const int PORT = 8888;
 
 static void build_rcv_group(int *rcv_fd, size_t len, int family, int proto)
@@ -229,7 +231,7 @@ int main(void)
 	int *rcv_fd, nodes;
 
 	if (numa_available() < 0)
-		error(1, errno, "no numa api support");
+		ksft_exit_skip("no numa api support\n");
 
 	nodes = numa_max_node() + 1;
 
-- 
2.17.0

^ permalink raw reply related

* Re: [PATCH bpf-next v2 7/7] tools/bpftool: add perf subcommand
From: Y Song @ 2018-05-18 22:13 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Yonghong Song, Quentin Monnet, peterz, Alexei Starovoitov,
	Daniel Borkmann, netdev, kernel-team
In-Reply-To: <20180518135127.1f886b67@cakuba>

On Fri, May 18, 2018 at 1:51 PM, Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
> On Thu, 17 May 2018 22:03:10 -0700, Yonghong Song wrote:
>> The new command "bpftool perf [show | list]" will traverse
>> all processes under /proc, and if any fd is associated
>> with a perf event, it will print out related perf event
>> information. Documentation is also added.
>
> Thanks for the changes, it looks good with some minor nits which can be
> addressed as follow up if there is no other need to respin.  Please
> consider it:
>
> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>

Most likely will need respin. Will make suggested changes then.

>
>> Below is an example to show the results using bcc commands.
>> Running the following 4 bcc commands:
>>   kprobe:     trace.py '__x64_sys_nanosleep'
>>   kretprobe:  trace.py 'r::__x64_sys_nanosleep'
>>   tracepoint: trace.py 't:syscalls:sys_enter_nanosleep'
>>   uprobe:     trace.py 'p:/home/yhs/a.out:main'
>>
>> The bpftool command line and result:
>>
>>   $ bpftool perf
>>   pid 21711  fd 5: prog_id 5  kprobe  func __x64_sys_write  offset 0
>>   pid 21765  fd 5: prog_id 7  kretprobe  func __x64_sys_nanosleep  offset 0
>>   pid 21767  fd 5: prog_id 8  tracepoint  sys_enter_nanosleep
>>   pid 21800  fd 5: prog_id 9  uprobe  filename /home/yhs/a.out  offset 1159
>>
>>   $ bpftool -j perf
>>   {"pid":21711,"fd":5,"prog_id":5,"attach_info":"kprobe","func":"__x64_sys_write","offset":0}, \
>>   {"pid":21765,"fd":5,"prog_id":7,"attach_info":"kretprobe","func":"__x64_sys_nanosleep","offset":0}, \
>>   {"pid":21767,"fd":5,"prog_id":8,"attach_info":"tracepoint","tracepoint":"sys_enter_nanosleep"}, \
>>   {"pid":21800,"fd":5,"prog_id":9,"attach_info":"uprobe","filename":"/home/yhs/a.out","offset":1159}
>
> nit: this is now an array

Sorry, this is probably updated in middle of work. Will make the change in
the next revision.

>
>>   $ bpftool prog
>>   5: kprobe  name probe___x64_sys  tag e495a0c82f2c7a8d  gpl
>>         loaded_at 2018-05-15T04:46:37-0700  uid 0
>>         xlated 200B  not jited  memlock 4096B  map_ids 4
>>   7: kprobe  name probe___x64_sys  tag f2fdee479a503abf  gpl
>>         loaded_at 2018-05-15T04:48:32-0700  uid 0
>>         xlated 200B  not jited  memlock 4096B  map_ids 7
>>   8: tracepoint  name tracepoint__sys  tag 5390badef2395fcf  gpl
>>         loaded_at 2018-05-15T04:48:48-0700  uid 0
>>         xlated 200B  not jited  memlock 4096B  map_ids 8
>>   9: kprobe  name probe_main_1  tag 0a87bdc2e2953b6d  gpl
>>         loaded_at 2018-05-15T04:49:52-0700  uid 0
>>         xlated 200B  not jited  memlock 4096B  map_ids 9
>>
>>   $ ps ax | grep "python ./trace.py"
>>   21711 pts/0    T      0:03 python ./trace.py __x64_sys_write
>>   21765 pts/0    S+     0:00 python ./trace.py r::__x64_sys_nanosleep
>>   21767 pts/2    S+     0:00 python ./trace.py t:syscalls:sys_enter_nanosleep
>>   21800 pts/3    S+     0:00 python ./trace.py p:/home/yhs/a.out:main
>>   22374 pts/1    S+     0:00 grep --color=auto python ./trace.py
>>
>> Signed-off-by: Yonghong Song <yhs@fb.com>
>
>> diff --git a/tools/bpf/bpftool/bash-completion/bpftool b/tools/bpf/bpftool/bash-completion/bpftool
>> index b301c9b..3680ad4 100644
>> --- a/tools/bpf/bpftool/bash-completion/bpftool
>> +++ b/tools/bpf/bpftool/bash-completion/bpftool
>> @@ -448,6 +448,15 @@ _bpftool()
>>                      ;;
>>              esac
>>              ;;
>> +        cgroup)
>
> s/cgroup/perf/ :)

A mistake in my side to consolidate different version of code.
I did have "perf" in one of my versions and tested it properly.

>
>> +            case $command in
>> +                *)
>> +                    [[ $prev == $object ]] && \
>> +                        COMPREPLY=( $( compgen -W 'help \
>> +                            show list' -- "$cur" ) )
>> +                    ;;
>> +            esac
>> +            ;;
>>      esac
>>  } &&
>>  complete -F _bpftool bpftool
>
>> +static int show_proc(const char *fpath, const struct stat *sb,
>> +                  int tflag, struct FTW *ftwbuf)
>> +{
>> +     __u64 probe_offset, probe_addr;
>> +     __u32 prog_id, attach_info;
>> +     int err, pid = 0, fd = 0;
>> +     const char *pch;
>> +     char buf[4096];
>> +
>> +     /* prefix always /proc */
>> +     pch = fpath + 5;
>> +     if (*pch == '\0')
>> +             return 0;
>> +
>> +     /* pid should be all numbers */
>> +     pch++;
>> +     while (isdigit(*pch)) {
>> +             pid = pid * 10 + *pch - '0';
>> +             pch++;
>> +     }
>> +     if (*pch == '\0')
>> +             return 0;
>> +     if (*pch != '/')
>> +             return FTW_SKIP_SUBTREE;
>> +
>> +     /* check /proc/<pid>/fd directory */
>> +     pch++;
>> +     if (strncmp(pch, "fd", 2))
>> +             return FTW_SKIP_SUBTREE;
>> +     pch += 2;
>> +     if (*pch == '\0')
>> +             return 0;
>> +     if (*pch != '/')
>> +             return FTW_SKIP_SUBTREE;
>> +
>> +     /* check /proc/<pid>/fd/<fd_num> */
>> +     pch++;
>> +     while (isdigit(*pch)) {
>> +             fd = fd * 10 + *pch - '0';
>> +             pch++;
>> +     }
>> +     if (*pch != '\0')
>> +             return FTW_SKIP_SUBTREE;
>> +
>> +     /* query (pid, fd) for potential perf events */
>> +     err = bpf_task_fd_query(pid, fd, 0, buf, sizeof(buf), &prog_id,
>> +                             &attach_info, &probe_offset, &probe_addr);
>> +     if (err < 0)
>> +             return 0;
>
> nit: it could be nice from user perspective to detect whether kernel
>      supports the command and fail if not.  Otherwise user is not sure
>      if there is no output because kernel lacks support or because
>      there were really no attached progs.  Just a thought, not really
>      a requirement.

I agree with you. it is good to output an error if the kernel does not
support the syscall (e.g., either non-root or new subcommand is not
supported).

>
>> +     if (json_output)
>> +             print_perf_json(pid, fd, prog_id, attach_info, buf, probe_offset,
>> +                             probe_addr);
>> +     else
>> +             print_perf_plain(pid, fd, prog_id, attach_info, buf, probe_offset,
>> +                              probe_addr);
>> +
>> +     return 0;
>> +}
>> +
>> +static int do_show(int argc, char **argv)
>> +{
>> +     int err = 0, nopenfd = 16;
>> +     int flags = FTW_ACTIONRETVAL | FTW_PHYS;
>
> nit: reverse xmas tree

Will make the change in the next revision.

>
>> +     if (json_output)
>> +             jsonw_start_array(json_wtr);
>> +     if (nftw("/proc", show_proc, nopenfd, flags) == -1) {
>> +             p_err("%s", strerror(errno));
>> +             err = -1;
>> +     }
>> +     if (json_output)
>> +             jsonw_end_array(json_wtr);
>> +
>> +     return err;
>> +}

^ permalink raw reply

* Re: [PATCH iproute2] ip link: Do not call ll_name_to_index when creating a new link
From: Stephen Hemminger @ 2018-05-18 22:08 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev
In-Reply-To: <be2fd550-2276-9127-b481-9b8062e31b96@gmail.com>

On Thu, 17 May 2018 18:17:12 -0600
David Ahern <dsahern@gmail.com> wrote:

> On 5/17/18 4:36 PM, Stephen Hemminger wrote:
> > On Thu, 17 May 2018 16:22:37 -0600
> > dsahern@kernel.org wrote:
> >   
> >> From: David Ahern <dsahern@gmail.com>
> >>
> >> Using iproute2 to create a bridge and add 4094 vlans to it can take from
> >> 2 to 3 *minutes*. The reason is the extraneous call to ll_name_to_index.
> >> ll_name_to_index results in an ioctl(SIOCGIFINDEX) call which in turn
> >> invokes dev_load. If the index does not exist, which it won't when
> >> creating a new link, dev_load calls modprobe twice -- once for
> >> netdev-NAME and again for NAME. This is unnecessary overhead for each
> >> link create.
> >>
> >> When ip link is invoked for a new device, there is no reason to
> >> call ll_name_to_index for the new device. With this patch, creating
> >> a bridge and adding 4094 vlans takes less than 3 *seconds*.
> >>
> >> Signed-off-by: David Ahern <dsahern@gmail.com>  
> > 
> > Yes this looks like a real problem.
> > Isn't the cache supposed to reduce this?
> > 
> > Don't like to make lots of special case flags.
> >   
> 
> The device does not exist, so it won't be in any cache. ll_name_to_index
> already checks it though before calling if_nametoindex.

Good point, I just don't like adding more conditional paths in a function
it is a common source of errors.

What about just pushing the lookup down to the leaf functions that need it?

diff --git a/ip/ip_common.h b/ip/ip_common.h
index 1b89795caa58..49eb7d7bed40 100644
--- a/ip/ip_common.h
+++ b/ip/ip_common.h
@@ -36,7 +36,7 @@ int print_addrlabel(const struct sockaddr_nl *who,
 int print_neigh(const struct sockaddr_nl *who,
 		struct nlmsghdr *n, void *arg);
 int ipaddr_list_link(int argc, char **argv);
-void ipaddr_get_vf_rate(int, int *, int *, int);
+void ipaddr_get_vf_rate(int, int *, int *, const char *);
 void iplink_usage(void) __attribute__((noreturn));
 
 void iproute_reset_filter(int ifindex);
@@ -145,7 +145,7 @@ int lwt_parse_encap(struct rtattr *rta, size_t len, int *argcp, char ***argvp);
 void lwt_print_encap(FILE *fp, struct rtattr *encap_type, struct rtattr *encap);
 
 /* iplink_xdp.c */
-int xdp_parse(int *argc, char ***argv, struct iplink_req *req, __u32 ifindex,
+int xdp_parse(int *argc, char ***argv, struct iplink_req *req, const char *ifname,
 	      bool generic, bool drv, bool offload);
 void xdp_dump(FILE *fp, struct rtattr *tb, bool link, bool details);
 
diff --git a/ip/ipaddress.c b/ip/ipaddress.c
index 75539e057f6a..00da14c6f97c 100644
--- a/ip/ipaddress.c
+++ b/ip/ipaddress.c
@@ -1967,14 +1967,20 @@ ipaddr_loop_each_vf(struct rtattr *tb[], int vfnum, int *min, int *max)
 	exit(1);
 }
 
-void ipaddr_get_vf_rate(int vfnum, int *min, int *max, int idx)
+void ipaddr_get_vf_rate(int vfnum, int *min, int *max, const char *dev)
 {
 	struct nlmsg_chain linfo = { NULL, NULL};
 	struct rtattr *tb[IFLA_MAX+1];
 	struct ifinfomsg *ifi;
 	struct nlmsg_list *l;
 	struct nlmsghdr *n;
-	int len;
+	int idx, len;
+
+	idx = ll_name_to_index(dev);
+	if (idx == 0) {
+		fprintf(stderr, "Device %s does not exist\n", dev);
+		exit(1);
+	}
 
 	if (rtnl_wilddump_request(&rth, AF_UNSPEC, RTM_GETLINK) < 0) {
 		perror("Cannot send dump request");
diff --git a/ip/iplink.c b/ip/iplink.c
index 22afe0221f3c..9ff5f692a1d4 100644
--- a/ip/iplink.c
+++ b/ip/iplink.c
@@ -242,9 +242,10 @@ static int iplink_have_newlink(void)
 }
 #endif /* ! IPLINK_IOCTL_COMPAT */
 
-static int nl_get_ll_addr_len(unsigned int dev_index)
+static int nl_get_ll_addr_len(const char *ifname)
 {
 	int len;
+	int dev_index = ll_name_to_index(ifname);
 	struct iplink_req req = {
 		.n = {
 			.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)),
@@ -259,6 +260,9 @@ static int nl_get_ll_addr_len(unsigned int dev_index)
 	struct nlmsghdr *answer;
 	struct rtattr *tb[IFLA_MAX+1];
 
+	if (dev_index == 0)
+		return -1;
+
 	if (rtnl_talk(&rth, &req.n, &answer) < 0)
 		return -1;
 
@@ -337,7 +341,7 @@ static void iplink_parse_vf_vlan_info(int vf, int *argcp, char ***argvp,
 }
 
 static int iplink_parse_vf(int vf, int *argcp, char ***argvp,
-			   struct iplink_req *req, int dev_index)
+			   struct iplink_req *req, const char *dev)
 {
 	char new_rate_api = 0, count = 0, override_legacy_rate = 0;
 	struct ifla_vf_rate tivt;
@@ -373,7 +377,7 @@ static int iplink_parse_vf(int vf, int *argcp, char ***argvp,
 		NEXT_ARG();
 		if (matches(*argv, "mac") == 0) {
 			struct ifla_vf_mac ivm = { 0 };
-			int halen = nl_get_ll_addr_len(dev_index);
+			int halen = nl_get_ll_addr_len(dev);
 
 			NEXT_ARG();
 			ivm.vf = vf;
@@ -542,7 +546,7 @@ static int iplink_parse_vf(int vf, int *argcp, char ***argvp,
 		int tmin, tmax;
 
 		if (tivt.min_tx_rate == -1 || tivt.max_tx_rate == -1) {
-			ipaddr_get_vf_rate(tivt.vf, &tmin, &tmax, dev_index);
+			ipaddr_get_vf_rate(tivt.vf, &tmin, &tmax, dev);
 			if (tivt.min_tx_rate == -1)
 				tivt.min_tx_rate = tmin;
 			if (tivt.max_tx_rate == -1)
@@ -583,7 +587,6 @@ int iplink_parse(int argc, char **argv, struct iplink_req *req, char **type)
 	int vf = -1;
 	int numtxqueues = -1;
 	int numrxqueues = -1;
-	int dev_index = 0;
 	int link_netnsid = -1;
 	int index = 0;
 	int group = -1;
@@ -605,10 +608,8 @@ int iplink_parse(int argc, char **argv, struct iplink_req *req, char **type)
 			if (check_ifname(*argv))
 				invarg("\"name\" not a valid ifname", *argv);
 			name = *argv;
-			if (!dev) {
+			if (!dev)
 				dev = name;
-				dev_index = ll_name_to_index(dev);
-			}
 		} else if (strcmp(*argv, "index") == 0) {
 			NEXT_ARG();
 			if (index)
@@ -660,7 +661,7 @@ int iplink_parse(int argc, char **argv, struct iplink_req *req, char **type)
 			bool offload = strcmp(*argv, "xdpoffload") == 0;
 
 			NEXT_ARG();
-			if (xdp_parse(&argc, &argv, req, dev_index,
+			if (xdp_parse(&argc, &argv, req, dev,
 				      generic, drv, offload))
 				exit(-1);
 
@@ -750,10 +751,10 @@ int iplink_parse(int argc, char **argv, struct iplink_req *req, char **type)
 
 			vflist = addattr_nest(&req->n, sizeof(*req),
 					      IFLA_VFINFO_LIST);
-			if (dev_index == 0)
+			if (!dev)
 				missarg("dev");
 
-			len = iplink_parse_vf(vf, &argc, &argv, req, dev_index);
+			len = iplink_parse_vf(vf, &argc, &argv, req, dev);
 			if (len < 0)
 				return -1;
 			addattr_nest_end(&req->n, vflist);
@@ -916,7 +917,6 @@ int iplink_parse(int argc, char **argv, struct iplink_req *req, char **type)
 			if (check_ifname(*argv))
 				invarg("\"dev\" not a valid ifname", *argv);
 			dev = *argv;
-			dev_index = ll_name_to_index(dev);
 		}
 		argc--; argv++;
 	}
@@ -931,8 +931,8 @@ int iplink_parse(int argc, char **argv, struct iplink_req *req, char **type)
 	else if (!strcmp(name, dev))
 		name = dev;
 
-	if (dev_index && addr_len) {
-		int halen = nl_get_ll_addr_len(dev_index);
+	if (dev && addr_len) {
+		int halen = nl_get_ll_addr_len(dev);
 
 		if (halen >= 0 && halen != addr_len) {
 			fprintf(stderr,
diff --git a/ip/iplink_xdp.c b/ip/iplink_xdp.c
index 83826358aa22..dd4fd1fd3a3b 100644
--- a/ip/iplink_xdp.c
+++ b/ip/iplink_xdp.c
@@ -48,8 +48,8 @@ static int xdp_delete(struct xdp_req *xdp)
 	return 0;
 }
 
-int xdp_parse(int *argc, char ***argv, struct iplink_req *req, __u32 ifindex,
-	      bool generic, bool drv, bool offload)
+int xdp_parse(int *argc, char ***argv, struct iplink_req *req,
+	      const char *ifname, bool generic, bool drv, bool offload)
 {
 	struct bpf_cfg_in cfg = {
 		.type = BPF_PROG_TYPE_XDP,
@@ -61,6 +61,8 @@ int xdp_parse(int *argc, char ***argv, struct iplink_req *req, __u32 ifindex,
 	};
 
 	if (offload) {
+		int ifindex = ll_name_to_index(ifname);
+
 		if (!ifindex)
 			incomplete_command();
 		cfg.ifindex = ifindex;

^ permalink raw reply related

* Re: [PATCH net-next v2 1/3] net: ethernet: ti: Allow most drivers with COMPILE_TEST
From: kbuild test robot @ 2018-05-18 21:54 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: kbuild-all, netdev, fugang.duan, Florian Fainelli,
	David S. Miller, Andrew Lunn, open list
In-Reply-To: <20180516185258.20508-2-f.fainelli@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 19228 bytes --]

Hi Florian,

I love your patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Florian-Fainelli/net-ethernet-ti-Allow-most-drivers-with-COMPILE_TEST/20180519-043005
config: ia64-allmodconfig (attached as .config)
compiler: ia64-linux-gcc (GCC) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=ia64 

All warnings (new ones prefixed by >>):

   drivers/net/ethernet/ti/netcp_core.c: In function 'netcp_free_rx_desc_chain':
>> drivers/net/ethernet/ti/netcp_core.c:613:13: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
      buf_ptr = (void *)GET_SW_DATA0(ndesc);
                ^
   drivers/net/ethernet/ti/netcp_core.c:622:12: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
     buf_ptr = (void *)GET_SW_DATA0(desc);
               ^
   drivers/net/ethernet/ti/netcp_core.c: In function 'netcp_process_one_rx_packet':
   drivers/net/ethernet/ti/netcp_core.c:681:16: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
     org_buf_ptr = (void *)GET_SW_DATA0(desc);
                   ^
   drivers/net/ethernet/ti/netcp_core.c:718:10: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
      page = (struct page *)GET_SW_DATA0(ndesc);
             ^
   drivers/net/ethernet/ti/netcp_core.c: In function 'netcp_free_rx_buf':
   drivers/net/ethernet/ti/netcp_core.c:822:13: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
      buf_ptr = (void *)GET_SW_DATA0(desc);
                ^
   drivers/net/ethernet/ti/netcp_core.c: In function 'netcp_allocate_rx_buf':
>> drivers/net/ethernet/ti/netcp_core.c:906:16: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
      sw_data[0] = (u32)bufptr;
                   ^
   drivers/net/ethernet/ti/netcp_core.c:919:16: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
      sw_data[0] = (u32)page;
                   ^
   drivers/net/ethernet/ti/netcp_core.c: In function 'netcp_process_tx_compl_packets':
   drivers/net/ethernet/ti/netcp_core.c:1041:9: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
      skb = (struct sk_buff *)GET_SW_DATA0(desc);
            ^
   drivers/net/ethernet/ti/netcp_core.c: In function 'netcp_tx_submit_skb':
   drivers/net/ethernet/ti/netcp_core.c:1256:15: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
     SET_SW_DATA0((u32)skb, desc);
                  ^
   drivers/net/ethernet/ti/netcp_core.c:181:49: note: in definition of macro 'SET_SW_DATA0'
    #define SET_SW_DATA0(data, desc) set_sw_data(0, data, desc)
                                                    ^~~~

vim +613 drivers/net/ethernet/ti/netcp_core.c

84640e27 Karicheri, Muralidharan 2015-01-15  591  
84640e27 Karicheri, Muralidharan 2015-01-15  592  static void netcp_free_rx_desc_chain(struct netcp_intf *netcp,
84640e27 Karicheri, Muralidharan 2015-01-15  593  				     struct knav_dma_desc *desc)
84640e27 Karicheri, Muralidharan 2015-01-15  594  {
84640e27 Karicheri, Muralidharan 2015-01-15  595  	struct knav_dma_desc *ndesc;
84640e27 Karicheri, Muralidharan 2015-01-15  596  	dma_addr_t dma_desc, dma_buf;
84640e27 Karicheri, Muralidharan 2015-01-15  597  	unsigned int buf_len, dma_sz = sizeof(*ndesc);
84640e27 Karicheri, Muralidharan 2015-01-15  598  	void *buf_ptr;
958d104e Arnd Bergmann           2015-12-18  599  	u32 tmp;
84640e27 Karicheri, Muralidharan 2015-01-15  600  
84640e27 Karicheri, Muralidharan 2015-01-15  601  	get_words(&dma_desc, 1, &desc->next_desc);
84640e27 Karicheri, Muralidharan 2015-01-15  602  
84640e27 Karicheri, Muralidharan 2015-01-15  603  	while (dma_desc) {
84640e27 Karicheri, Muralidharan 2015-01-15  604  		ndesc = knav_pool_desc_unmap(netcp->rx_pool, dma_desc, dma_sz);
84640e27 Karicheri, Muralidharan 2015-01-15  605  		if (unlikely(!ndesc)) {
84640e27 Karicheri, Muralidharan 2015-01-15  606  			dev_err(netcp->ndev_dev, "failed to unmap Rx desc\n");
84640e27 Karicheri, Muralidharan 2015-01-15  607  			break;
84640e27 Karicheri, Muralidharan 2015-01-15  608  		}
958d104e Arnd Bergmann           2015-12-18  609  		get_pkt_info(&dma_buf, &tmp, &dma_desc, ndesc);
06324481 Karicheri, Muralidharan 2016-02-19  610  		/* warning!!!! We are retrieving the virtual ptr in the sw_data
06324481 Karicheri, Muralidharan 2016-02-19  611  		 * field as a 32bit value. Will not work on 64bit machines
06324481 Karicheri, Muralidharan 2016-02-19  612  		 */
06324481 Karicheri, Muralidharan 2016-02-19 @613  		buf_ptr = (void *)GET_SW_DATA0(ndesc);
06324481 Karicheri, Muralidharan 2016-02-19  614  		buf_len = (int)GET_SW_DATA1(desc);
84640e27 Karicheri, Muralidharan 2015-01-15  615  		dma_unmap_page(netcp->dev, dma_buf, PAGE_SIZE, DMA_FROM_DEVICE);
84640e27 Karicheri, Muralidharan 2015-01-15  616  		__free_page(buf_ptr);
84640e27 Karicheri, Muralidharan 2015-01-15  617  		knav_pool_desc_put(netcp->rx_pool, desc);
84640e27 Karicheri, Muralidharan 2015-01-15  618  	}
06324481 Karicheri, Muralidharan 2016-02-19  619  	/* warning!!!! We are retrieving the virtual ptr in the sw_data
06324481 Karicheri, Muralidharan 2016-02-19  620  	 * field as a 32bit value. Will not work on 64bit machines
06324481 Karicheri, Muralidharan 2016-02-19  621  	 */
06324481 Karicheri, Muralidharan 2016-02-19  622  	buf_ptr = (void *)GET_SW_DATA0(desc);
06324481 Karicheri, Muralidharan 2016-02-19  623  	buf_len = (int)GET_SW_DATA1(desc);
89907779 Arnd Bergmann           2015-12-08  624  
84640e27 Karicheri, Muralidharan 2015-01-15  625  	if (buf_ptr)
84640e27 Karicheri, Muralidharan 2015-01-15  626  		netcp_frag_free(buf_len <= PAGE_SIZE, buf_ptr);
84640e27 Karicheri, Muralidharan 2015-01-15  627  	knav_pool_desc_put(netcp->rx_pool, desc);
84640e27 Karicheri, Muralidharan 2015-01-15  628  }
84640e27 Karicheri, Muralidharan 2015-01-15  629  
84640e27 Karicheri, Muralidharan 2015-01-15  630  static void netcp_empty_rx_queue(struct netcp_intf *netcp)
84640e27 Karicheri, Muralidharan 2015-01-15  631  {
6a8162e9 Michael Scherban        2017-01-06  632  	struct netcp_stats *rx_stats = &netcp->stats;
84640e27 Karicheri, Muralidharan 2015-01-15  633  	struct knav_dma_desc *desc;
84640e27 Karicheri, Muralidharan 2015-01-15  634  	unsigned int dma_sz;
84640e27 Karicheri, Muralidharan 2015-01-15  635  	dma_addr_t dma;
84640e27 Karicheri, Muralidharan 2015-01-15  636  
84640e27 Karicheri, Muralidharan 2015-01-15  637  	for (; ;) {
84640e27 Karicheri, Muralidharan 2015-01-15  638  		dma = knav_queue_pop(netcp->rx_queue, &dma_sz);
84640e27 Karicheri, Muralidharan 2015-01-15  639  		if (!dma)
84640e27 Karicheri, Muralidharan 2015-01-15  640  			break;
84640e27 Karicheri, Muralidharan 2015-01-15  641  
84640e27 Karicheri, Muralidharan 2015-01-15  642  		desc = knav_pool_desc_unmap(netcp->rx_pool, dma, dma_sz);
84640e27 Karicheri, Muralidharan 2015-01-15  643  		if (unlikely(!desc)) {
84640e27 Karicheri, Muralidharan 2015-01-15  644  			dev_err(netcp->ndev_dev, "%s: failed to unmap Rx desc\n",
84640e27 Karicheri, Muralidharan 2015-01-15  645  				__func__);
6a8162e9 Michael Scherban        2017-01-06  646  			rx_stats->rx_errors++;
84640e27 Karicheri, Muralidharan 2015-01-15  647  			continue;
84640e27 Karicheri, Muralidharan 2015-01-15  648  		}
84640e27 Karicheri, Muralidharan 2015-01-15  649  		netcp_free_rx_desc_chain(netcp, desc);
6a8162e9 Michael Scherban        2017-01-06  650  		rx_stats->rx_dropped++;
84640e27 Karicheri, Muralidharan 2015-01-15  651  	}
84640e27 Karicheri, Muralidharan 2015-01-15  652  }
84640e27 Karicheri, Muralidharan 2015-01-15  653  
84640e27 Karicheri, Muralidharan 2015-01-15  654  static int netcp_process_one_rx_packet(struct netcp_intf *netcp)
84640e27 Karicheri, Muralidharan 2015-01-15  655  {
6a8162e9 Michael Scherban        2017-01-06  656  	struct netcp_stats *rx_stats = &netcp->stats;
84640e27 Karicheri, Muralidharan 2015-01-15  657  	unsigned int dma_sz, buf_len, org_buf_len;
84640e27 Karicheri, Muralidharan 2015-01-15  658  	struct knav_dma_desc *desc, *ndesc;
84640e27 Karicheri, Muralidharan 2015-01-15  659  	unsigned int pkt_sz = 0, accum_sz;
84640e27 Karicheri, Muralidharan 2015-01-15  660  	struct netcp_hook_list *rx_hook;
84640e27 Karicheri, Muralidharan 2015-01-15  661  	dma_addr_t dma_desc, dma_buff;
84640e27 Karicheri, Muralidharan 2015-01-15  662  	struct netcp_packet p_info;
84640e27 Karicheri, Muralidharan 2015-01-15  663  	struct sk_buff *skb;
84640e27 Karicheri, Muralidharan 2015-01-15  664  	void *org_buf_ptr;
69d707d0 Karicheri, Muralidharan 2017-01-06  665  	u32 tmp;
84640e27 Karicheri, Muralidharan 2015-01-15  666  
84640e27 Karicheri, Muralidharan 2015-01-15  667  	dma_desc = knav_queue_pop(netcp->rx_queue, &dma_sz);
84640e27 Karicheri, Muralidharan 2015-01-15  668  	if (!dma_desc)
84640e27 Karicheri, Muralidharan 2015-01-15  669  		return -1;
84640e27 Karicheri, Muralidharan 2015-01-15  670  
84640e27 Karicheri, Muralidharan 2015-01-15  671  	desc = knav_pool_desc_unmap(netcp->rx_pool, dma_desc, dma_sz);
84640e27 Karicheri, Muralidharan 2015-01-15  672  	if (unlikely(!desc)) {
84640e27 Karicheri, Muralidharan 2015-01-15  673  		dev_err(netcp->ndev_dev, "failed to unmap Rx desc\n");
84640e27 Karicheri, Muralidharan 2015-01-15  674  		return 0;
84640e27 Karicheri, Muralidharan 2015-01-15  675  	}
84640e27 Karicheri, Muralidharan 2015-01-15  676  
84640e27 Karicheri, Muralidharan 2015-01-15  677  	get_pkt_info(&dma_buff, &buf_len, &dma_desc, desc);
06324481 Karicheri, Muralidharan 2016-02-19  678  	/* warning!!!! We are retrieving the virtual ptr in the sw_data
06324481 Karicheri, Muralidharan 2016-02-19  679  	 * field as a 32bit value. Will not work on 64bit machines
06324481 Karicheri, Muralidharan 2016-02-19  680  	 */
06324481 Karicheri, Muralidharan 2016-02-19  681  	org_buf_ptr = (void *)GET_SW_DATA0(desc);
06324481 Karicheri, Muralidharan 2016-02-19  682  	org_buf_len = (int)GET_SW_DATA1(desc);
84640e27 Karicheri, Muralidharan 2015-01-15  683  
84640e27 Karicheri, Muralidharan 2015-01-15  684  	if (unlikely(!org_buf_ptr)) {
84640e27 Karicheri, Muralidharan 2015-01-15  685  		dev_err(netcp->ndev_dev, "NULL bufptr in desc\n");
84640e27 Karicheri, Muralidharan 2015-01-15  686  		goto free_desc;
84640e27 Karicheri, Muralidharan 2015-01-15  687  	}
84640e27 Karicheri, Muralidharan 2015-01-15  688  
84640e27 Karicheri, Muralidharan 2015-01-15  689  	pkt_sz &= KNAV_DMA_DESC_PKT_LEN_MASK;
84640e27 Karicheri, Muralidharan 2015-01-15  690  	accum_sz = buf_len;
84640e27 Karicheri, Muralidharan 2015-01-15  691  	dma_unmap_single(netcp->dev, dma_buff, buf_len, DMA_FROM_DEVICE);
84640e27 Karicheri, Muralidharan 2015-01-15  692  
84640e27 Karicheri, Muralidharan 2015-01-15  693  	/* Build a new sk_buff for the primary buffer */
84640e27 Karicheri, Muralidharan 2015-01-15  694  	skb = build_skb(org_buf_ptr, org_buf_len);
84640e27 Karicheri, Muralidharan 2015-01-15  695  	if (unlikely(!skb)) {
84640e27 Karicheri, Muralidharan 2015-01-15  696  		dev_err(netcp->ndev_dev, "build_skb() failed\n");
84640e27 Karicheri, Muralidharan 2015-01-15  697  		goto free_desc;
84640e27 Karicheri, Muralidharan 2015-01-15  698  	}
84640e27 Karicheri, Muralidharan 2015-01-15  699  
84640e27 Karicheri, Muralidharan 2015-01-15  700  	/* update data, tail and len */
84640e27 Karicheri, Muralidharan 2015-01-15  701  	skb_reserve(skb, NETCP_SOP_OFFSET);
84640e27 Karicheri, Muralidharan 2015-01-15  702  	__skb_put(skb, buf_len);
84640e27 Karicheri, Muralidharan 2015-01-15  703  
84640e27 Karicheri, Muralidharan 2015-01-15  704  	/* Fill in the page fragment list */
84640e27 Karicheri, Muralidharan 2015-01-15  705  	while (dma_desc) {
84640e27 Karicheri, Muralidharan 2015-01-15  706  		struct page *page;
84640e27 Karicheri, Muralidharan 2015-01-15  707  
84640e27 Karicheri, Muralidharan 2015-01-15  708  		ndesc = knav_pool_desc_unmap(netcp->rx_pool, dma_desc, dma_sz);
84640e27 Karicheri, Muralidharan 2015-01-15  709  		if (unlikely(!ndesc)) {
84640e27 Karicheri, Muralidharan 2015-01-15  710  			dev_err(netcp->ndev_dev, "failed to unmap Rx desc\n");
84640e27 Karicheri, Muralidharan 2015-01-15  711  			goto free_desc;
84640e27 Karicheri, Muralidharan 2015-01-15  712  		}
84640e27 Karicheri, Muralidharan 2015-01-15  713  
84640e27 Karicheri, Muralidharan 2015-01-15  714  		get_pkt_info(&dma_buff, &buf_len, &dma_desc, ndesc);
06324481 Karicheri, Muralidharan 2016-02-19  715  		/* warning!!!! We are retrieving the virtual ptr in the sw_data
06324481 Karicheri, Muralidharan 2016-02-19  716  		 * field as a 32bit value. Will not work on 64bit machines
06324481 Karicheri, Muralidharan 2016-02-19  717  		 */
5a717843 Rex Chang               2018-01-16 @718  		page = (struct page *)GET_SW_DATA0(ndesc);
84640e27 Karicheri, Muralidharan 2015-01-15  719  
84640e27 Karicheri, Muralidharan 2015-01-15  720  		if (likely(dma_buff && buf_len && page)) {
84640e27 Karicheri, Muralidharan 2015-01-15  721  			dma_unmap_page(netcp->dev, dma_buff, PAGE_SIZE,
84640e27 Karicheri, Muralidharan 2015-01-15  722  				       DMA_FROM_DEVICE);
84640e27 Karicheri, Muralidharan 2015-01-15  723  		} else {
89907779 Arnd Bergmann           2015-12-08  724  			dev_err(netcp->ndev_dev, "Bad Rx desc dma_buff(%pad), len(%d), page(%p)\n",
89907779 Arnd Bergmann           2015-12-08  725  				&dma_buff, buf_len, page);
84640e27 Karicheri, Muralidharan 2015-01-15  726  			goto free_desc;
84640e27 Karicheri, Muralidharan 2015-01-15  727  		}
84640e27 Karicheri, Muralidharan 2015-01-15  728  
84640e27 Karicheri, Muralidharan 2015-01-15  729  		skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page,
84640e27 Karicheri, Muralidharan 2015-01-15  730  				offset_in_page(dma_buff), buf_len, PAGE_SIZE);
84640e27 Karicheri, Muralidharan 2015-01-15  731  		accum_sz += buf_len;
84640e27 Karicheri, Muralidharan 2015-01-15  732  
84640e27 Karicheri, Muralidharan 2015-01-15  733  		/* Free the descriptor */
84640e27 Karicheri, Muralidharan 2015-01-15  734  		knav_pool_desc_put(netcp->rx_pool, ndesc);
84640e27 Karicheri, Muralidharan 2015-01-15  735  	}
84640e27 Karicheri, Muralidharan 2015-01-15  736  
84640e27 Karicheri, Muralidharan 2015-01-15  737  	/* check for packet len and warn */
84640e27 Karicheri, Muralidharan 2015-01-15  738  	if (unlikely(pkt_sz != accum_sz))
84640e27 Karicheri, Muralidharan 2015-01-15  739  		dev_dbg(netcp->ndev_dev, "mismatch in packet size(%d) & sum of fragments(%d)\n",
84640e27 Karicheri, Muralidharan 2015-01-15  740  			pkt_sz, accum_sz);
84640e27 Karicheri, Muralidharan 2015-01-15  741  
4cd85a61 Karicheri, Muralidharan 2017-01-06  742  	/* Newer version of the Ethernet switch can trim the Ethernet FCS
4cd85a61 Karicheri, Muralidharan 2017-01-06  743  	 * from the packet and is indicated in hw_cap. So trim it only for
4cd85a61 Karicheri, Muralidharan 2017-01-06  744  	 * older h/w
4cd85a61 Karicheri, Muralidharan 2017-01-06  745  	 */
4cd85a61 Karicheri, Muralidharan 2017-01-06  746  	if (!(netcp->hw_cap & ETH_SW_CAN_REMOVE_ETH_FCS))
84640e27 Karicheri, Muralidharan 2015-01-15  747  		__pskb_trim(skb, skb->len - ETH_FCS_LEN);
84640e27 Karicheri, Muralidharan 2015-01-15  748  
84640e27 Karicheri, Muralidharan 2015-01-15  749  	/* Call each of the RX hooks */
84640e27 Karicheri, Muralidharan 2015-01-15  750  	p_info.skb = skb;
6246168b WingMan Kwok            2016-12-08  751  	skb->dev = netcp->ndev;
84640e27 Karicheri, Muralidharan 2015-01-15  752  	p_info.rxtstamp_complete = false;
69d707d0 Karicheri, Muralidharan 2017-01-06  753  	get_desc_info(&tmp, &p_info.eflags, desc);
69d707d0 Karicheri, Muralidharan 2017-01-06  754  	p_info.epib = desc->epib;
69d707d0 Karicheri, Muralidharan 2017-01-06  755  	p_info.psdata = (u32 __force *)desc->psdata;
69d707d0 Karicheri, Muralidharan 2017-01-06  756  	p_info.eflags = ((p_info.eflags >> KNAV_DMA_DESC_EFLAGS_SHIFT) &
69d707d0 Karicheri, Muralidharan 2017-01-06  757  			 KNAV_DMA_DESC_EFLAGS_MASK);
84640e27 Karicheri, Muralidharan 2015-01-15  758  	list_for_each_entry(rx_hook, &netcp->rxhook_list_head, list) {
84640e27 Karicheri, Muralidharan 2015-01-15  759  		int ret;
84640e27 Karicheri, Muralidharan 2015-01-15  760  
84640e27 Karicheri, Muralidharan 2015-01-15  761  		ret = rx_hook->hook_rtn(rx_hook->order, rx_hook->hook_data,
84640e27 Karicheri, Muralidharan 2015-01-15  762  					&p_info);
84640e27 Karicheri, Muralidharan 2015-01-15  763  		if (unlikely(ret)) {
84640e27 Karicheri, Muralidharan 2015-01-15  764  			dev_err(netcp->ndev_dev, "RX hook %d failed: %d\n",
84640e27 Karicheri, Muralidharan 2015-01-15  765  				rx_hook->order, ret);
69d707d0 Karicheri, Muralidharan 2017-01-06  766  			/* Free the primary descriptor */
6a8162e9 Michael Scherban        2017-01-06  767  			rx_stats->rx_dropped++;
69d707d0 Karicheri, Muralidharan 2017-01-06  768  			knav_pool_desc_put(netcp->rx_pool, desc);
84640e27 Karicheri, Muralidharan 2015-01-15  769  			dev_kfree_skb(skb);
84640e27 Karicheri, Muralidharan 2015-01-15  770  			return 0;
84640e27 Karicheri, Muralidharan 2015-01-15  771  		}
84640e27 Karicheri, Muralidharan 2015-01-15  772  	}
69d707d0 Karicheri, Muralidharan 2017-01-06  773  	/* Free the primary descriptor */
69d707d0 Karicheri, Muralidharan 2017-01-06  774  	knav_pool_desc_put(netcp->rx_pool, desc);
84640e27 Karicheri, Muralidharan 2015-01-15  775  
6a8162e9 Michael Scherban        2017-01-06  776  	u64_stats_update_begin(&rx_stats->syncp_rx);
6a8162e9 Michael Scherban        2017-01-06  777  	rx_stats->rx_packets++;
6a8162e9 Michael Scherban        2017-01-06  778  	rx_stats->rx_bytes += skb->len;
6a8162e9 Michael Scherban        2017-01-06  779  	u64_stats_update_end(&rx_stats->syncp_rx);
84640e27 Karicheri, Muralidharan 2015-01-15  780  
84640e27 Karicheri, Muralidharan 2015-01-15  781  	/* push skb up the stack */
84640e27 Karicheri, Muralidharan 2015-01-15  782  	skb->protocol = eth_type_trans(skb, netcp->ndev);
84640e27 Karicheri, Muralidharan 2015-01-15  783  	netif_receive_skb(skb);
84640e27 Karicheri, Muralidharan 2015-01-15  784  	return 0;
84640e27 Karicheri, Muralidharan 2015-01-15  785  
84640e27 Karicheri, Muralidharan 2015-01-15  786  free_desc:
84640e27 Karicheri, Muralidharan 2015-01-15  787  	netcp_free_rx_desc_chain(netcp, desc);
6a8162e9 Michael Scherban        2017-01-06  788  	rx_stats->rx_errors++;
84640e27 Karicheri, Muralidharan 2015-01-15  789  	return 0;
84640e27 Karicheri, Muralidharan 2015-01-15  790  }
84640e27 Karicheri, Muralidharan 2015-01-15  791  

:::::: The code at line 613 was first introduced by commit
:::::: 0632448134d0ac1450a19d26f90948fde3b558ad net: netcp: rework the code for get/set sw_data in dma desc

:::::: TO: Karicheri, Muralidharan <m-karicheri2@ti.com>
:::::: CC: David S. Miller <davem@davemloft.net>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 50018 bytes --]

^ permalink raw reply

* Re: [PATCH iproute2] Allow to configure /var/run/netns directory
From: Stephen Hemminger @ 2018-05-18 21:53 UTC (permalink / raw)
  To: Pavel Maltsev; +Cc: netdev, lorenzo
In-Reply-To: <20180515214946.222797-1-pavelm@google.com>

On Tue, 15 May 2018 14:49:46 -0700
Pavel Maltsev <pavelm@google.com> wrote:

> Currently NETNS_RUN_DIR is hardcoded and refers to /var/run/netns.
> However, some systems (e.g. Android) doesn't have /var
> which results in error attempts to create network namespaces on these
> systems.  This change makes NETNS_RUN_DIR configurable at build time
> by allowing to pass environment variable to configre script.
> 
> For example: NETNS_RUN_DIR=/mnt/vendor/netns ./configure && make
> 
> Tested: verified that iproute2 with configuration mentioned above
> creates namespaces in /mnt/vendor/netns
> 
> Signed-off-by: Pavel Maltsev <pavelm@google.com>

The directory path should definitely be overrideable on the build.
The configure script is already messy enough, lets do it instead like
the other runtime directories are already done ARPDDIR and CONFDIR.

Something like?

diff --git a/Makefile b/Makefile
index b526d3b5b5c4..ab828669e711 100644
--- a/Makefile
+++ b/Makefile
@@ -16,6 +16,7 @@ PREFIX?=/usr
 LIBDIR?=$(PREFIX)/lib
 SBINDIR?=/sbin
 CONFDIR?=/etc/iproute2
+NETNS_RUN_DIR?=/var/run/netns
 DATADIR?=$(PREFIX)/share
 HDRDIR?=$(PREFIX)/include/iproute2
 DOCDIR?=$(DATADIR)/doc/iproute2
@@ -34,7 +35,7 @@ ifneq ($(SHARED_LIBS),y)
 DEFINES+= -DNO_SHARED_LIBS
 endif
 
-DEFINES+=-DCONFDIR=\"$(CONFDIR)\"
+DEFINES+=-DCONFDIR=\"$(CONFDIR)\" -DNETNS_RUN_DIR=\"$(NETNS_RUN_DIR)\"
 
 #options for decnet
 ADDLIB+=dnet_ntop.o dnet_pton.o
diff --git a/include/namespace.h b/include/namespace.h
index aed7ce08507f..e47f9b5d49d1 100644
--- a/include/namespace.h
+++ b/include/namespace.h
@@ -8,8 +8,13 @@
 #include <sys/syscall.h>
 #include <errno.h>
 
+#ifndef NETNS_RUN_DIR
 #define NETNS_RUN_DIR "/var/run/netns"
+#endif
+
+#ifndef NETNS_ETC_DIR
 #define NETNS_ETC_DIR "/etc/netns"
+#endif
 
 #ifndef CLONE_NEWNET
 #define CLONE_NEWNET 0x40000000	/* New network namespace (lo, device, names sockets, etc) */

^ permalink raw reply related

* [PATCH v2] isdn: eicon: fix a missing-check bug
From: Wenwen Wang @ 2018-05-18 21:33 UTC (permalink / raw)
  To: Wenwen Wang
  Cc: Kangjie Lu, Armin Schindler, Karsten Keil,
	open list:ISDN SUBSYSTEM, open list

In divasmain.c, the function divas_write() firstly invokes the function
diva_xdi_open_adapter() to open the adapter that matches with the adapter
number provided by the user, and then invokes the function diva_xdi_write()
to perform the write operation using the matched adapter. The two functions
diva_xdi_open_adapter() and diva_xdi_write() are located in diva.c.

In diva_xdi_open_adapter(), the user command is copied to the object 'msg'
from the userspace pointer 'src' through the function pointer 'cp_fn',
which eventually calls copy_from_user() to do the copy. Then, the adapter
number 'msg.adapter' is used to find out a matched adapter from the
'adapter_queue'. A matched adapter will be returned if it is found.
Otherwise, NULL is returned to indicate the failure of the verification on
the adapter number.

As mentioned above, if a matched adapter is returned, the function
diva_xdi_write() is invoked to perform the write operation. In this
function, the user command is copied once again from the userspace pointer
'src', which is the same as the 'src' pointer in diva_xdi_open_adapter() as
both of them are from the 'buf' pointer in divas_write(). Similarly, the
copy is achieved through the function pointer 'cp_fn', which finally calls
copy_from_user(). After the successful copy, the corresponding command
processing handler of the matched adapter is invoked to perform the write
operation.

It is obvious that there are two copies here from userspace, one is in
diva_xdi_open_adapter(), and one is in diva_xdi_write(). Plus, both of
these two copies share the same source userspace pointer, i.e., the 'buf'
pointer in divas_write(). Given that a malicious userspace process can race
to change the content pointed by the 'buf' pointer, this can pose potential
security issues. For example, in the first copy, the user provides a valid
adapter number to pass the verification process and a valid adapter can be
found. Then the user can modify the adapter number to an invalid number.
This way, the user can bypass the verification process of the adapter
number and inject inconsistent data.

This patch reuses the data copied in
diva_xdi_open_adapter() and passes it to diva_xdi_write(). This way, the
above issues can be avoided.

Signed-off-by: Wenwen Wang <wang6495@umn.edu>
---
 drivers/isdn/hardware/eicon/diva.c      | 20 +++++++++++++-------
 drivers/isdn/hardware/eicon/diva.h      |  5 +++--
 drivers/isdn/hardware/eicon/divasmain.c | 18 +++++++++++-------
 3 files changed, 27 insertions(+), 16 deletions(-)

diff --git a/drivers/isdn/hardware/eicon/diva.c b/drivers/isdn/hardware/eicon/diva.c
index 944a7f3..fa239d8 100644
--- a/drivers/isdn/hardware/eicon/diva.c
+++ b/drivers/isdn/hardware/eicon/diva.c
@@ -388,10 +388,9 @@ void divasa_xdi_driver_unload(void)
 **  Receive and process command from user mode utility
 */
 void *diva_xdi_open_adapter(void *os_handle, const void __user *src,
-			    int length,
+			    int length, diva_xdi_um_cfg_cmd_t *msg,
 			    divas_xdi_copy_from_user_fn_t cp_fn)
 {
-	diva_xdi_um_cfg_cmd_t msg;
 	diva_os_xdi_adapter_t *a = NULL;
 	diva_os_spin_lock_magic_t old_irql;
 	struct list_head *tmp;
@@ -401,21 +400,21 @@ void *diva_xdi_open_adapter(void *os_handle, const void __user *src,
 			 length, sizeof(diva_xdi_um_cfg_cmd_t)))
 			return NULL;
 	}
-	if ((*cp_fn) (os_handle, &msg, src, sizeof(msg)) <= 0) {
+	if ((*cp_fn) (os_handle, msg, src, sizeof(*msg)) <= 0) {
 		DBG_ERR(("A: A(?) open, write error"))
 			return NULL;
 	}
 	diva_os_enter_spin_lock(&adapter_lock, &old_irql, "open_adapter");
 	list_for_each(tmp, &adapter_queue) {
 		a = list_entry(tmp, diva_os_xdi_adapter_t, link);
-		if (a->controller == (int)msg.adapter)
+		if (a->controller == (int)msg->adapter)
 			break;
 		a = NULL;
 	}
 	diva_os_leave_spin_lock(&adapter_lock, &old_irql, "open_adapter");
 
 	if (!a) {
-		DBG_ERR(("A: A(%d) open, adapter not found", msg.adapter))
+		DBG_ERR(("A: A(%d) open, adapter not found", msg->adapter))
 			}
 
 	return (a);
@@ -437,7 +436,8 @@ void diva_xdi_close_adapter(void *adapter, void *os_handle)
 
 int
 diva_xdi_write(void *adapter, void *os_handle, const void __user *src,
-	       int length, divas_xdi_copy_from_user_fn_t cp_fn)
+	       int length, diva_xdi_um_cfg_cmd_t *msg,
+	       divas_xdi_copy_from_user_fn_t cp_fn)
 {
 	diva_os_xdi_adapter_t *a = (diva_os_xdi_adapter_t *) adapter;
 	void *data;
@@ -459,7 +459,13 @@ diva_xdi_write(void *adapter, void *os_handle, const void __user *src,
 			return (-2);
 	}
 
-	length = (*cp_fn) (os_handle, data, src, length);
+	if (msg) {
+		*(diva_xdi_um_cfg_cmd_t *)data = *msg;
+		length = (*cp_fn) (os_handle, (char *)data + sizeof(*msg),
+				   src + sizeof(*msg), length - sizeof(*msg));
+	} else {
+		length = (*cp_fn) (os_handle, data, src, length);
+	}
 	if (length > 0) {
 		if ((*(a->interface.cmd_proc))
 		    (a, (diva_xdi_um_cfg_cmd_t *) data, length)) {
diff --git a/drivers/isdn/hardware/eicon/diva.h b/drivers/isdn/hardware/eicon/diva.h
index b067032..eb454c5 100644
--- a/drivers/isdn/hardware/eicon/diva.h
+++ b/drivers/isdn/hardware/eicon/diva.h
@@ -20,10 +20,11 @@ int diva_xdi_read(void *adapter, void *os_handle, void __user *dst,
 		  int max_length, divas_xdi_copy_to_user_fn_t cp_fn);
 
 int diva_xdi_write(void *adapter, void *os_handle, const void __user *src,
-		   int length, divas_xdi_copy_from_user_fn_t cp_fn);
+		   int length, diva_xdi_um_cfg_cmd_t *msg,
+		   divas_xdi_copy_from_user_fn_t cp_fn);
 
 void *diva_xdi_open_adapter(void *os_handle, const void __user *src,
-			    int length,
+			    int length, diva_xdi_um_cfg_cmd_t *msg,
 			    divas_xdi_copy_from_user_fn_t cp_fn);
 
 void diva_xdi_close_adapter(void *adapter, void *os_handle);
diff --git a/drivers/isdn/hardware/eicon/divasmain.c b/drivers/isdn/hardware/eicon/divasmain.c
index b9980e8..b6a3950 100644
--- a/drivers/isdn/hardware/eicon/divasmain.c
+++ b/drivers/isdn/hardware/eicon/divasmain.c
@@ -591,19 +591,22 @@ static int divas_release(struct inode *inode, struct file *file)
 static ssize_t divas_write(struct file *file, const char __user *buf,
 			   size_t count, loff_t *ppos)
 {
+	diva_xdi_um_cfg_cmd_t msg;
 	int ret = -EINVAL;
 
 	if (!file->private_data) {
 		file->private_data = diva_xdi_open_adapter(file, buf,
-							   count,
+							   count, &msg,
 							   xdi_copy_from_user);
-	}
-	if (!file->private_data) {
-		return (-ENODEV);
+		if (!file->private_data)
+			return (-ENODEV);
+		ret = diva_xdi_write(file->private_data, file,
+				     buf, count, &msg, xdi_copy_from_user);
+	} else {
+		ret = diva_xdi_write(file->private_data, file,
+				     buf, count, NULL, xdi_copy_from_user);
 	}
 
-	ret = diva_xdi_write(file->private_data, file,
-			     buf, count, xdi_copy_from_user);
 	switch (ret) {
 	case -1:		/* Message should be removed from rx mailbox first */
 		ret = -EBUSY;
@@ -622,11 +625,12 @@ static ssize_t divas_write(struct file *file, const char __user *buf,
 static ssize_t divas_read(struct file *file, char __user *buf,
 			  size_t count, loff_t *ppos)
 {
+	diva_xdi_um_cfg_cmd_t msg;
 	int ret = -EINVAL;
 
 	if (!file->private_data) {
 		file->private_data = diva_xdi_open_adapter(file, buf,
-							   count,
+							   count, &msg,
 							   xdi_copy_from_user);
 	}
 	if (!file->private_data) {
-- 
2.7.4

^ permalink raw reply related

* Re: [PATCH 05/15] mtd: nand: pxa3xx: remove the dmaengine compat need
From: Daniel Mack @ 2018-05-18 21:31 UTC (permalink / raw)
  To: Robert Jarzmik, Haojian Zhuang, Bartlomiej Zolnierkiewicz,
	Tejun Heo, Vinod Koul, Mauro Carvalho Chehab, Ulf Hansson,
	Ezequiel Garcia, Boris Brezillon, David Woodhouse, Brian Norris,
	Marek Vasut, Richard Weinberger, Cyrille Pitchen, Nicolas Pitre,
	Samuel Ortiz, Greg Kroah-Hartman, Jaroslav Kysela, Takashi Iwai,
	Liam Girdwood, Mark Brown, Arnd Bergmann
  Cc: devel, alsa-devel, netdev, linux-mmc, linux-kernel, linux-ide,
	linux-mtd, dmaengine, Robert Jarzmik, linux-arm-kernel,
	linux-media
In-Reply-To: <20180402142656.26815-6-robert.jarzmik@free.fr>

[-- Attachment #1: Type: text/plain, Size: 761 bytes --]

Hi Robert,

Thanks for this series.

On Monday, April 02, 2018 04:26 PM, Robert Jarzmik wrote:
> From: Robert Jarzmik <robert.jarzmik@renault.com>
> 
> As the pxa architecture switched towards the dmaengine slave map, the
> old compatibility mechanism to acquire the dma requestor line number and
> priority are not needed anymore.
> 
> This patch simplifies the dma resource acquisition, using the more
> generic function dma_request_slave_channel().
> 
> Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
> ---
>   drivers/mtd/nand/pxa3xx_nand.c | 10 +---------

This driver was replaced by drivers/mtd/nand/raw/marvell_nand.c 
recently, so this patch can be dropped. I attached a version for the new 
driver which you can pick instead.


Thanks,
Daniel

[-- Attachment #2: 0001-mtd-rawnand-marvell-remove-dmaengine-compat-code.patch --]
[-- Type: text/x-patch, Size: 1633 bytes --]

>From c63bc40bdfe2d596e42919235840109a2f1b2776 Mon Sep 17 00:00:00 2001
From: Daniel Mack <daniel@zonque.org>
Date: Sat, 12 May 2018 21:50:13 +0200
Subject: [PATCH] mtd: rawnand: marvell: remove dmaengine compat code

As the pxa architecture switched towards the dmaengine slave map, the
old compatibility mechanism to acquire the dma requestor line number and
priority are not needed anymore.

This patch simplifies the dma resource acquisition, using the more
generic function dma_request_slave_channel().

Signed-off-by: Daniel Mack <daniel@zonque.org>
---
 drivers/mtd/nand/raw/marvell_nand.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/mtd/nand/raw/marvell_nand.c b/drivers/mtd/nand/raw/marvell_nand.c
index ebb1d141b900..30017cd7d91c 100644
--- a/drivers/mtd/nand/raw/marvell_nand.c
+++ b/drivers/mtd/nand/raw/marvell_nand.c
@@ -2612,8 +2612,6 @@ static int marvell_nfc_init_dma(struct marvell_nfc *nfc)
 						    dev);
 	struct dma_slave_config config = {};
 	struct resource *r;
-	dma_cap_mask_t mask;
-	struct pxad_param param;
 	int ret;
 
 	if (!IS_ENABLED(CONFIG_PXA_DMA)) {
@@ -2632,14 +2630,7 @@ static int marvell_nfc_init_dma(struct marvell_nfc *nfc)
 		return -ENXIO;
 	}
 
-	param.drcmr = r->start;
-	param.prio = PXAD_PRIO_LOWEST;
-	dma_cap_zero(mask);
-	dma_cap_set(DMA_SLAVE, mask);
-	nfc->dma_chan =
-		dma_request_slave_channel_compat(mask, pxad_filter_fn,
-						 &param, nfc->dev,
-						 "data");
+	nfc->dma_chan = dma_request_slave_channel(nfc->dev, "data");
 	if (!nfc->dma_chan) {
 		dev_err(nfc->dev,
 			"Unable to request data DMA channel\n");
-- 
2.14.3


[-- Attachment #3: Type: text/plain, Size: 176 bytes --]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related

* Re: [patch net-next 0/5] devlink: introduce port flavours and common phys_port_name generation
From: Jakub Kicinski @ 2018-05-18 21:16 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, idosch, mlxsw, andrew, vivien.didelot, f.fainelli,
	michael.chan, ganeshgr, saeedm, simon.horman,
	pieter.jansenvanvuuren, john.hurley, dirk.vandermerwe,
	alexander.h.duyck, ogerlitz, dsahern, vijaya.guvva,
	satananda.burla, raghu.vatsavayi, felix.manlunas, gospo,
	sathya.perla, vasundhara-v.volam, tariqt, eranbe,
	jeffrey.t.kirsher, roopa
In-Reply-To: <20180518072904.29523-1-jiri@resnulli.us>

On Fri, 18 May 2018 09:28:59 +0200, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
> 
> This patchset resolves 2 issues we have right now:
> 1) There are many netdevices / ports in the system, for port, pf, vf
>    represenatation but the user has no way to see which is which
> 2) The ndo_get_phys_port_name is implemented in each driver separatelly,
>    which may lead to inconsistent names between drivers.
> 
> This patchset introduces port flavours which should address the first
> problem. In this initial patchset, I focus on DSA and their port
> flavours. As a follow-up, I plan to add PF and VF representor flavours.
> However, that needs additional dependencies in drivers (nfp, mlx5).
> 
> The common phys_port_name generation is used by mlxsw. An example output
> for mlxsw looks like this:

FWIW this series LGTM!

^ permalink raw reply

* [RFC PATCH 5/6] net: ethernet: ti: cpsw: restore shaper configuration while down/up
From: Ivan Khoronzhuk @ 2018-05-18 21:15 UTC (permalink / raw)
  To: grygorii.strashko, davem
  Cc: corbet, akpm, netdev, linux-doc, linux-kernel, linux-omap,
	vinicius.gomes, henrik, jesus.sanchez-palencia, Ivan Khoronzhuk
In-Reply-To: <20180518211510.13341-1-ivan.khoronzhuk@linaro.org>

Need to restore shapers configuration after interface was down/up.
This is needed as appropriate configuration is still replicated in
kernel settings. This only shapers context restore, so vlan
configuration should be restored by user if needed, especially for
devices with one port where vlan frames are sent via ALE.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
---
 drivers/net/ethernet/ti/cpsw.c | 47 ++++++++++++++++++++++++++++++++++
 1 file changed, 47 insertions(+)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index c7710b0e1c17..c3e88be36c1b 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -1807,6 +1807,51 @@ static int cpsw_set_cbs(struct net_device *ndev,
 	return ret;
 }
 
+static void cpsw_cbs_resume(struct cpsw_slave *slave, struct cpsw_priv *priv)
+{
+	int fifo, bw;
+
+	for (fifo = CPSW_FIFO_SHAPERS_NUM; fifo > 0; fifo--) {
+		bw = priv->fifo_bw[fifo];
+		if (!bw)
+			continue;
+
+		cpsw_set_fifo_rlimit(priv, fifo, bw);
+	}
+}
+
+static void cpsw_mqprio_resume(struct cpsw_slave *slave, struct cpsw_priv *priv)
+{
+	struct cpsw_common *cpsw = priv->cpsw;
+	u32 tx_prio_map = 0;
+	int i, tc, fifo;
+	u32 tx_prio_rg;
+
+	if (!priv->mqprio_hw)
+		return;
+
+	for (i = 0; i < 8; i++) {
+		tc = netdev_get_prio_tc_map(priv->ndev, i);
+		fifo = CPSW_FIFO_SHAPERS_NUM - tc;
+		tx_prio_map |= fifo << (4 * i);
+	}
+
+	tx_prio_rg = cpsw->version == CPSW_VERSION_1 ?
+		     CPSW1_TX_PRI_MAP : CPSW2_TX_PRI_MAP;
+
+	slave_write(slave, tx_prio_map, tx_prio_rg);
+}
+
+/* restore resources after port reset */
+static void cpsw_restore(struct cpsw_priv *priv)
+{
+	/* restore MQPRIO offload */
+	for_each_slave(priv, cpsw_mqprio_resume, priv);
+
+	/* restore CBS offload */
+	for_each_slave(priv, cpsw_cbs_resume, priv);
+}
+
 static int cpsw_ndo_open(struct net_device *ndev)
 {
 	struct cpsw_priv *priv = netdev_priv(ndev);
@@ -1886,6 +1931,8 @@ static int cpsw_ndo_open(struct net_device *ndev)
 
 	}
 
+	cpsw_restore(priv);
+
 	/* Enable Interrupt pacing if configured */
 	if (cpsw->coal_intvl != 0) {
 		struct ethtool_coalesce coal;
-- 
2.17.0

^ permalink raw reply related

* [RFC PATCH 3/6] net: ethernet: ti: cpsw: add MQPRIO Qdisc offload
From: Ivan Khoronzhuk @ 2018-05-18 21:15 UTC (permalink / raw)
  To: grygorii.strashko, davem
  Cc: corbet, akpm, netdev, linux-doc, linux-kernel, linux-omap,
	vinicius.gomes, henrik, jesus.sanchez-palencia, Ivan Khoronzhuk
In-Reply-To: <20180518211510.13341-1-ivan.khoronzhuk@linaro.org>

That's possible to offload vlan to tc priority mapping with
assumption sk_prio == L2 prio.

Example:
$ ethtool -L eth0 rx 1 tx 4

$ qdisc replace dev eth0 handle 100: parent root mqprio num_tc 3 \
map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 1

$ tc -g class show dev eth0
+---(100:ffe2) mqprio
|    +---(100:3) mqprio
|    +---(100:4) mqprio
|    
+---(100:ffe1) mqprio
|    +---(100:2) mqprio
|    
+---(100:ffe0) mqprio
     +---(100:1) mqprio

Here, 100:1 is txq0, 100:2 is txq1, 100:3 is txq2, 100:4 is txq3
txq0 belongs to tc0, txq1 to tc1, txq2 and txq3 to tc2
The offload part only maps L2 prio to classes of traffic, but not
to transmit queues, so to direct traffic to traffic class vlan has
to be created with appropriate egress map.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
---
 drivers/net/ethernet/ti/cpsw.c | 82 ++++++++++++++++++++++++++++++++++
 1 file changed, 82 insertions(+)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 9bd615da04d3..4b232cda5436 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -39,6 +39,7 @@
 #include <linux/sys_soc.h>
 
 #include <linux/pinctrl/consumer.h>
+#include <net/pkt_cls.h>
 
 #include "cpsw.h"
 #include "cpsw_ale.h"
@@ -153,6 +154,8 @@ do {								\
 #define IRQ_NUM			2
 #define CPSW_MAX_QUEUES		8
 #define CPSW_CPDMA_DESCS_POOL_SIZE_DEFAULT 256
+#define CPSW_TC_NUM			4
+#define CPSW_FIFO_SHAPERS_NUM		(CPSW_TC_NUM - 1)
 
 #define CPSW_RX_VLAN_ENCAP_HDR_PRIO_SHIFT	29
 #define CPSW_RX_VLAN_ENCAP_HDR_PRIO_MSK		GENMASK(2, 0)
@@ -453,6 +456,7 @@ struct cpsw_priv {
 	u8				mac_addr[ETH_ALEN];
 	bool				rx_pause;
 	bool				tx_pause;
+	bool				mqprio_hw;
 	u32 emac_port;
 	struct cpsw_common *cpsw;
 };
@@ -1577,6 +1581,14 @@ static void cpsw_slave_stop(struct cpsw_slave *slave, struct cpsw_common *cpsw)
 	soft_reset_slave(slave);
 }
 
+static int cpsw_tc_to_fifo(int tc, int num_tc)
+{
+	if (tc == num_tc - 1)
+		return 0;
+
+	return CPSW_FIFO_SHAPERS_NUM - tc;
+}
+
 static int cpsw_ndo_open(struct net_device *ndev)
 {
 	struct cpsw_priv *priv = netdev_priv(ndev);
@@ -2190,6 +2202,75 @@ static int cpsw_ndo_set_tx_maxrate(struct net_device *ndev, int queue, u32 rate)
 	return ret;
 }
 
+static int cpsw_set_tc(struct net_device *ndev, void *type_data)
+{
+	struct tc_mqprio_qopt_offload *mqprio = type_data;
+	struct cpsw_priv *priv = netdev_priv(ndev);
+	struct cpsw_common *cpsw = priv->cpsw;
+	int fifo, num_tc, count, offset;
+	struct cpsw_slave *slave;
+	u32 tx_prio_map = 0;
+	int i, tc, ret;
+
+	num_tc = mqprio->qopt.num_tc;
+	if (num_tc > CPSW_TC_NUM)
+		return -EINVAL;
+
+	if (mqprio->mode != TC_MQPRIO_MODE_DCB)
+		return -EINVAL;
+
+	ret = pm_runtime_get_sync(cpsw->dev);
+	if (ret < 0) {
+		pm_runtime_put_noidle(cpsw->dev);
+		return ret;
+	}
+
+	if (num_tc) {
+		for (i = 0; i < 8; i++) {
+			tc = mqprio->qopt.prio_tc_map[i];
+			fifo = cpsw_tc_to_fifo(tc, num_tc);
+			tx_prio_map |= fifo << (4 * i);
+		}
+
+		netdev_set_num_tc(ndev, num_tc);
+		for (i = 0; i < num_tc; i++) {
+			count = mqprio->qopt.count[i];
+			offset = mqprio->qopt.offset[i];
+			netdev_set_tc_queue(ndev, i, count, offset);
+		}
+	}
+
+	if (!mqprio->qopt.hw) {
+		/* restore default configuration */
+		netdev_reset_tc(ndev);
+		tx_prio_map = TX_PRIORITY_MAPPING;
+	}
+
+	priv->mqprio_hw = mqprio->qopt.hw;
+
+	offset = cpsw->version == CPSW_VERSION_1 ?
+		 CPSW1_TX_PRI_MAP : CPSW2_TX_PRI_MAP;
+
+	slave = &cpsw->slaves[cpsw_slave_index(cpsw, priv)];
+	slave_write(slave, tx_prio_map, offset);
+
+	pm_runtime_put_sync(cpsw->dev);
+
+	return 0;
+}
+
+static int cpsw_ndo_setup_tc(struct net_device *ndev, enum tc_setup_type type,
+			     void *type_data)
+{
+	switch (type) {
+	case TC_SETUP_QDISC_MQPRIO:
+		return cpsw_set_tc(ndev, type_data);
+
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
 static const struct net_device_ops cpsw_netdev_ops = {
 	.ndo_open		= cpsw_ndo_open,
 	.ndo_stop		= cpsw_ndo_stop,
@@ -2205,6 +2286,7 @@ static const struct net_device_ops cpsw_netdev_ops = {
 #endif
 	.ndo_vlan_rx_add_vid	= cpsw_ndo_vlan_rx_add_vid,
 	.ndo_vlan_rx_kill_vid	= cpsw_ndo_vlan_rx_kill_vid,
+	.ndo_setup_tc           = cpsw_ndo_setup_tc,
 };
 
 static int cpsw_get_regs_len(struct net_device *ndev)
-- 
2.17.0

^ permalink raw reply related

* [RFC PATCH 6/6] Documentation: networking: cpsw: add MQPRIO & CBS offload examples
From: Ivan Khoronzhuk @ 2018-05-18 21:15 UTC (permalink / raw)
  To: grygorii.strashko, davem
  Cc: corbet, akpm, netdev, linux-doc, linux-kernel, linux-omap,
	vinicius.gomes, henrik, jesus.sanchez-palencia, Ivan Khoronzhuk
In-Reply-To: <20180518211510.13341-1-ivan.khoronzhuk@linaro.org>

This document describes MQPRIO and CBS Qdisc offload configuration
for cpsw driver based on examples. It potentially can be used in
audio video bridging (AVB) and time sensitive networking (TSN).

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
---
 Documentation/networking/cpsw.txt | 540 ++++++++++++++++++++++++++++++
 1 file changed, 540 insertions(+)
 create mode 100644 Documentation/networking/cpsw.txt

diff --git a/Documentation/networking/cpsw.txt b/Documentation/networking/cpsw.txt
new file mode 100644
index 000000000000..28c64896d59d
--- /dev/null
+++ b/Documentation/networking/cpsw.txt
@@ -0,0 +1,540 @@
+* Texas Instruments CPSW ethernet driver
+
+Multiqueue & CBS & MQPRIO
+=====================================================================
+=====================================================================
+
+The cpsw has 3 CBS shapers for each external ports. This document
+describes MQPRIO and CBS Qdisc offload configuration for cpsw driver
+based on examples. It potentially can be used in audio video bridging
+(AVB) and time sensitive networking (TSN).
+
+The following examples was tested on AM572x EVM and BBB boards.
+
+Test setup
+==========
+
+Under consideration two examples with AM52xx EVM running cpsw driver
+in dual_emac mode.
+
+Several prerequisites:
+- TX queues must be rated starting from txq0 that has highest priority
+- Traffic classes are used starting from 0, that has highest priority
+- CBS shapers should be used with rated queues
+- The bandwidth for CBS shapers has to be set a little bit more then
+  potential incoming rate, thus, rate of all incoming tx queues has
+  to be a little less
+- Real rates can differ, due to discreetness
+- Map skb-priority to txq is not enough, also skb-priority to l2 prio
+  map has to be created with ip or vconfig tool
+- Any l2/socket prio (0 - 7) for classes can be used, but for
+  simplicity default values are used: 3 and 2
+- only 2 classes tested: A and B, but checked and can work with more,
+  maximum allowed 4, but only for 3 rate can be set.
+
+Test setup for examples
+=======================
+                                    +-------------------------------+
+                                    |--+                            |
+                                    |  |      Workstation0          |
+                                    |E |  MAC 18:03:73:66:87:42     |
++-----------------------------+  +--|t |                            |
+|                    | 1  | E |  |  |h |./tsn_listener -d \         |
+|  Target board:     | 0  | t |--+  |0 | 18:03:73:66:87:42 -i eth0 \|
+|  AM572x EVM        | 0  | h |     |  | -s 1500                    |
+|                    | 0  | 0 |     |--+                            |
+|  Only 2 classes:   |Mb  +---|     +-------------------------------+
+|  class A, class B  |        |
+|                    |    +---|     +-------------------------------+
+|                    | 1  | E |     |--+                            |
+|                    | 0  | t |     |  |      Workstation1          |
+|                    | 0  | h |--+  |E |  MAC 20:cf:30:85:7d:fd     |
+|                    |Mb  | 1 |  +--|t |                            |
++-----------------------------+     |h |./tsn_listener -d \         |
+                                    |0 | 20:cf:30:85:7d:fd -i eth0 \|
+                                    |  | -s 1500                    |
+                                    |--+                            |
+                                    +-------------------------------+
+
+*********************************************************************
+*********************************************************************
+*********************************************************************
+Example 1: One port tx AVB configuration scheme for target board
+----------------------------------------------------------------------
+(prints and scheme for AM52xx evm, applicable for single port boards)
+
+tc - traffic class
+txq - transmit queue
+p - priority
+f - fifo (cpsw fifo)
+S - shaper configured
+
++------------------------------------------------------------------+ u
+| +---------------+  +---------------+  +------+ +------+          | s
+| |               |  |               |  |      | |      |          | e
+| | App 1         |  | App 2         |  | Apps | | Apps |          | r
+| | Class A       |  | Class B       |  | Rest | | Rest |          |
+| | Eth0          |  | Eth0          |  | Eth0 | | Eth1 |          | s
+| | VLAN100       |  | VLAN100       |  |   |  | |   |  |          | p
+| | 40 Mb/s       |  | 20 Mb/s       |  |   |  | |   |  |          | a
+| | SO_PRIORITY=3 |  | SO_PRIORITY=2 |  |   |  | |   |  |          | c
+| |   |           |  |   |           |  |   |  | |   |  |          | e
+| +---|-----------+  +---|-----------+  +---|--+ +---|--+          |
++-----|------------------|------------------|--------|-------------+
+    +-+     +------------+                  |        |
+    |       |             +-----------------+     +--+
+    |       |             |                       |
++---|-------|-------------|-----------------------|----------------+
+| +----+ +----+ +----+ +----+                   +----+             |
+| | p3 | | p2 | | p1 | | p0 |                   | p0 |             | k
+| \    / \    / \    / \    /                   \    /             | e
+|  \  /   \  /   \  /   \  /                     \  /              | r
+|   \/     \/     \/     \/                       \/               | n
+|    |     |             |                        |                | e
+|    |     |       +-----+                        |                | l
+|    |     |       |                              |                |
+| +----+ +----+ +----+                          +----+             | s
+| |tc0 | |tc1 | |tc2 |                          |tc0 |             | p
+| \    / \    / \    /                          \    /             | a
+|  \  /   \  /   \  /                            \  /              | c
+|   \/     \/     \/                              \/               | e
+|   |      |       +-----+                        |                |
+|   |      |       |     |                        |                |
+|   |      |       |     |                        |                |
+|   |      |       |     |                        |                |
+| +----+ +----+ +----+ +----+                   +----+             |
+| |txq0| |txq1| |txq2| |txq3|                   |txq4|             |
+| \    / \    / \    / \    /                   \    /             |
+|  \  /   \  /   \  /   \  /                     \  /              |
+|   \/     \/     \/     \/                       \/               |
+| +-|------|------|------|--+                  +--|--------------+ |
+| | |      |      |      |  | Eth0.100         |  |     Eth1     | |
++---|------|------|------|------------------------|----------------+
+    |      |      |      |                        |
+    p      p      p      p                        |
+    3      2      0-1, 4-7  <- L2 priority        |
+    |      |      |      |                        |
+    |      |      |      |                        |
++---|------|------|------|------------------------|----------------+
+|   |      |      |      |             |----------+                |
+| +----+ +----+ +----+ +----+       +----+                         |
+| |dma7| |dma6| |dma5| |dma4|       |dma4|                         |
+| \    / \    / \    / \    /       \    /                         | c
+|  \S /   \S /   \  /   \  /         \  /                          | p
+|   \/     \/     \/     \/           \/                           | s
+|   |      |      | +-----            |                            | w
+|   |      |      | |                 |                            |
+|   |      |      | |                 |                            | d
+| +----+ +----+ +----+p            p+----+                         | r
+| |    | |    | |    |o            o|    |                         | i
+| | f3 | | f2 | | f0 |r            r| f0 |                         | v
+| |tc0 | |tc1 | |tc2 |t            t|tc0 |                         | e
+| \CBS / \CBS / \CBS /1            2\CBS /                         | r
+|  \S /   \S /   \  /                \  /                          |
+|   \/     \/     \/                  \/                           |
++------------------------------------------------------------------+
+========================================Eth==========================>
+
+1)
+// Add 4 tx queues, for interface Eth0, and 1 tx queue for Eth1
+$ ethtool -L eth0 rx 1 tx 5
+rx unmodified, ignoring
+
+2)
+// Check if num of queues is set correctly:
+$ ethtool -l eth0
+Channel parameters for eth0:
+Pre-set maximums:
+RX:             8
+TX:             8
+Other:          0
+Combined:       0
+Current hardware settings:
+RX:             1
+TX:             5
+Other:          0
+Combined:       0
+
+3)
+// TX queues must be rated starting from 0, so set bws for tx0 and tx1
+// Set rates 40 and 20 Mb/s appropriately.
+// Pay attention, real speed can differ a bit due to discreetness.
+// Leave last 2 tx queues not rated.
+$ echo 40 > /sys/class/net/eth0/queues/tx-0/tx_maxrate
+$ echo 20 > /sys/class/net/eth0/queues/tx-1/tx_maxrate
+
+4)
+// Check maximum rate of tx (cpdma) queues:
+$ cat /sys/class/net/eth0/queues/tx-*/tx_maxrate
+40
+20
+0
+0
+0
+
+5)
+// Map skb->priority to traffic class:
+// 3pri -> tc0, 2pri -> tc1, (0,1,4-7)pri -> tc2
+// Map traffic class to transmit queue:
+// tc0 -> txq0, tc1 -> txq1, tc2 -> (txq2, txq3)
+$ tc qdisc replace dev eth0 handle 100: parent root mqprio num_tc 3 \
+map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 1
+
+5a)
+// As two interface sharing same set of tx queues, assign all traffic
+// coming to interface Eth1 to separate queue in order to not mix it
+// with traffic from interface Eth0, so use separate txq to send
+// packets to Eth1, so all prio -> tc0 and tc0 -> txq4
+// Here hw 0, so here still default configuration for eth1 in hw
+$ tc qdisc replace dev eth1 handle 100: parent root mqprio num_tc 1 \
+map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 queues 1@4 hw 0
+
+6)
+// Check classes settings
+$ tc -g class show dev eth0
++---(100:ffe2) mqprio
+|    +---(100:3) mqprio
+|    +---(100:4) mqprio
+|
++---(100:ffe1) mqprio
+|    +---(100:2) mqprio
+|
++---(100:ffe0) mqprio
+     +---(100:1) mqprio
+
+$ tc -g class show dev eth1
++---(100:ffe0) mqprio
+     +---(100:5) mqprio
+
+7)
+// Set rate for class A - 41 Mbit (tc0, txq0) using CBS Qdisc
+// Set it +1 Mb for reserve (important!)
+// here only idle slope is important, others arg are ignored
+// Pay attention, real speed can differ a bit due to discreetness
+$ tc qdisc add dev eth0 parent 100:1 cbs locredit -1438 \
+hicredit 62 sendslope -959000 idleslope 41000 offload 1
+net eth0: set FIFO3 bw = 50
+
+8)
+// Set rate for class B - 21 Mbit (tc1, txq1) using CBS Qdisc:
+// Set it +1 Mb for reserve (important!)
+$ tc qdisc add dev eth0 parent 100:2 cbs locredit -1468 \
+hicredit 65 sendslope -979000 idleslope 21000 offload 1
+net eth0: set FIFO2 bw = 30
+
+9)
+// Create vlan 100 to map sk->priority to vlan qos
+$ ip link add link eth0 name eth0.100 type vlan id 100
+8021q: 802.1Q VLAN Support v1.8
+8021q: adding VLAN 0 to HW filter on device eth0
+8021q: adding VLAN 0 to HW filter on device eth1
+net eth0: Adding vlanid 100 to vlan filter
+
+10)
+// Map skb->priority to L2 prio, 1 to 1
+$ ip link set eth0.100 type vlan \
+egress 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
+
+11)
+// Check egress map for vlan 100
+$ cat /proc/net/vlan/eth0.100
+[...]
+INGRESS priority mappings: 0:0  1:0  2:0  3:0  4:0  5:0  6:0 7:0
+EGRESS priority mappings: 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
+
+12)
+// Run your appropriate tools with socket option "SO_PRIORITY"
+// to 3 for class A and/or to 2 for class B
+// (I took at https://www.spinics.net/lists/netdev/msg460869.html)
+./tsn_talker -d 18:03:73:66:87:42 -i eth0.100 -p3 -s 1500&
+./tsn_talker -d 18:03:73:66:87:42 -i eth0.100 -p2 -s 1500&
+
+13)
+// run your listener on workstation
+// (I took at https://www.spinics.net/lists/netdev/msg460869.html)
+./tsn_listener -d 18:03:73:66:87:42 -i enp5s0 -s 1500
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39000 kbps
+
+14)
+// Restore default configuration if needed
+$ ip link del eth0.100
+$ tc qdisc del dev eth1 root
+$ tc qdisc del dev eth0 root
+net eth0: Prev FIFO2 is shaped
+net eth0: set FIFO3 bw = 0
+net eth0: set FIFO2 bw = 0
+$ ethtool -L eth0 rx 1 tx 1
+
+*********************************************************************
+*********************************************************************
+*********************************************************************
+Example 2: Two port tx AVB configuration scheme for target board
+----------------------------------------------------------------------
+(prints and scheme for AM52xx evm, for dual emac boards only)
+
++------------------------------------------------------------------+ u
+| +----------+  +----------+  +------+  +----------+  +----------+ | s
+| |          |  |          |  |      |  |          |  |          | | e
+| | App 1    |  | App 2    |  | Apps |  | App 3    |  | App 4    | | r
+| | Class A  |  | Class B  |  | Rest |  | Class B  |  | Class A  | |
+| | Eth0     |  | Eth0     |  |   |  |  | Eth1     |  | Eth1     | | s
+| | VLAN100  |  | VLAN100  |  |   |  |  | VLAN100  |  | VLAN100  | | p
+| | 40 Mb/s  |  | 20 Mb/s  |  |   |  |  | 10 Mb/s  |  | 30 Mb/s  | | a
+| | SO_PRI=3 |  | SO_PRI=2 |  |   |  |  | SO_PRI=3 |  | SO_PRI=2 | | c
+| |   |      |  |   |      |  |   |  |  |   |      |  |   |      | | e
+| +---|------+  +---|------+  +---|--+  +---|------+  +---|------+ |
++-----|-------------|-------------|---------|-------------|--------+
+    +-+     +-------+             |         +----------+  +----+
+    |       |             +-------+------+             |       |
+    |       |             |              |             |       |
++---|-------|-------------|--------------|-------------|-------|---+
+| +----+ +----+ +----+ +----+          +-+--+ +----+ +----+ +----+ |
+| | p3 | | p2 | | p1 | | p0 |          | p0 | | p1 | | p2 | | p3 | | k
+| \    / \    / \    / \    /          \    / \    / \    / \    / | e
+|  \  /   \  /   \  /   \  /            \  /   \  /   \  /   \  /  | r
+|   \/     \/     \/     \/              \/     \/     \/     \/   | n
+|   |      |             |                |             |      |   | e
+|   |      |        +----+                +----+        |      |   | l
+|   |      |        |                          |        |      |   |
+| +----+ +----+ +----+                        +----+ +----+ +----+ | s
+| |tc0 | |tc1 | |tc2 |                        |tc2 | |tc1 | |tc0 | | p
+| \    / \    / \    /                        \    / \    / \    / | a
+|  \  /   \  /   \  /                          \  /   \  /   \  /  | c
+|   \/     \/     \/                            \/     \/     \/   | e
+|   |      |       +-----+                +-----+      |       |   |
+|   |      |       |     |                |     |      |       |   |
+|   |      |       |     |                |     |      |       |   |
+|   |      |       |     |    E      E    |     |      |       |   |
+| +----+ +----+ +----+ +----+ t      t +----+ +----+ +----+ +----+ |
+| |txq0| |txq1| |txq4| |txq5| h      h |txq6| |txq7| |txq3| |txq2| |
+| \    / \    / \    / \    / 0      1 \    / \    / \    / \    / |
+|  \  /   \  /   \  /   \  /  .      .  \  /   \  /   \  /   \  /  |
+|   \/     \/     \/     \/   1      1   \/     \/     \/     \/   |
+| +-|------|------|------|--+ 0      0 +-|------|------|------|--+ |
+| | |      |      |      |  | 0      0 | |      |      |      |  | |
++---|------|------|------|---------------|------|------|------|----+
+    |      |      |      |               |      |      |      |
+    p      p      p      p               p      p      p      p
+    3      2      0-1, 4-7   <-L2 pri->  0-1, 4-7      2      3
+    |      |      |      |               |      |      |      |
+    |      |      |      |               |      |      |      |
++---|------|------|------|---------------|------|------|------|----+
+|   |      |      |      |               |      |      |      |    |
+| +----+ +----+ +----+ +----+          +----+ +----+ +----+ +----+ |
+| |dma7| |dma6| |dma3| |dma2|          |dma1| |dma0| |dma4| |dma5| |
+| \    / \    / \    / \    /          \    / \    / \    / \    / | c
+|  \S /   \S /   \  /   \  /            \  /   \  /   \S /   \S /  | p
+|   \/     \/     \/     \/              \/     \/     \/     \/   | s
+|   |      |      | +-----                |      |      |      |   | w
+|   |      |      | |                     +----+ |      |      |   |
+|   |      |      | |                          | |      |      |   | d
+| +----+ +----+ +----+p                      p+----+ +----+ +----+ | r
+| |    | |    | |    |o                      o|    | |    | |    | | i
+| | f3 | | f2 | | f0 |r        CPSW          r| f3 | | f2 | | f0 | | v
+| |tc0 | |tc1 | |tc2 |t                      t|tc0 | |tc1 | |tc2 | | e
+| \CBS / \CBS / \CBS /1                      2\CBS / \CBS / \CBS / | r
+|  \S /   \S /   \  /                          \S /   \S /   \  /  |
+|   \/     \/     \/                            \/     \/     \/   |
++------------------------------------------------------------------+
+========================================Eth==========================>
+
+1)
+// Add 8 tx queues, for interface Eth0, but they are common, so are accessed
+// by two interfaces Eth0 and Eth1.
+$ ethtool -L eth1 rx 1 tx 8
+rx unmodified, ignoring
+
+2)
+// Check if num of queues is set correctly:
+$ ethtool -l eth0
+Channel parameters for eth0:
+Pre-set maximums:
+RX:             8
+TX:             8
+Other:          0
+Combined:       0
+Current hardware settings:
+RX:             1
+TX:             8
+Other:          0
+Combined:       0
+
+3)
+// TX queues must be rated starting from 0, so set bws for tx0 and tx1 for Eth0
+// and for tx2 and tx3 for Eth1. That is, rates 40 and 20 Mb/s appropriately
+// for Eth0 and 30 and 10 Mb/s for Eth1.
+// Real speed can differ a bit due to discreetness
+// Leave last 4 tx queues as not rated
+$ echo 40 > /sys/class/net/eth0/queues/tx-0/tx_maxrate
+$ echo 20 > /sys/class/net/eth0/queues/tx-1/tx_maxrate
+$ echo 30 > /sys/class/net/eth1/queues/tx-2/tx_maxrate
+$ echo 10 > /sys/class/net/eth1/queues/tx-3/tx_maxrate
+
+4)
+// Check maximum rate of tx (cpdma) queues:
+$ cat /sys/class/net/eth0/queues/tx-*/tx_maxrate
+40
+20
+30
+10
+0
+0
+0
+0
+
+5)
+// Map skb->priority to traffic class for Eth0:
+// 3pri -> tc0, 2pri -> tc1, (0,1,4-7)pri -> tc2
+// Map traffic class to transmit queue:
+// tc0 -> txq0, tc1 -> txq1, tc2 -> (txq4, txq5)
+$ tc qdisc replace dev eth0 handle 100: parent root mqprio num_tc 3 \
+map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@4 hw 1
+
+6)
+// Check classes settings
+$ tc -g class show dev eth0
++---(100:ffe2) mqprio
+|    +---(100:5) mqprio
+|    +---(100:6) mqprio
+|
++---(100:ffe1) mqprio
+|    +---(100:2) mqprio
+|
++---(100:ffe0) mqprio
+     +---(100:1) mqprio
+
+7)
+// Set rate for class A - 41 Mbit (tc0, txq0) using CBS Qdisc for Eth0
+// here only idle slope is important, others ignored
+// Real speed can differ a bit due to discreetness
+$ tc qdisc add dev eth0 parent 100:1 cbs locredit -1470 \
+hicredit 62 sendslope -959000 idleslope 41000 offload 1
+net eth0: set FIFO3 bw = 50
+
+8)
+// Set rate for class B - 21 Mbit (tc1, txq1) using CBS Qdisc for Eth0
+$ tc qdisc add dev eth0 parent 100:2 cbs locredit -1470 \
+hicredit 65 sendslope -979000 idleslope 21000 offload 1
+net eth0: set FIFO2 bw = 30
+
+9)
+// Create vlan 100 to map sk->priority to vlan qos for Eth0
+$ ip link add link eth0 name eth0.100 type vlan id 100
+net eth0: Adding vlanid 100 to vlan filter
+
+10)
+// Map skb->priority to L2 prio for Eth0.100, one to one
+$ ip link set eth0.100 type vlan \
+egress 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
+
+11)
+// Check egress map for vlan 100
+$ cat /proc/net/vlan/eth0.100
+[...]
+INGRESS priority mappings: 0:0  1:0  2:0  3:0  4:0  5:0  6:0 7:0
+EGRESS priority mappings: 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
+
+12)
+// Map skb->priority to traffic class for Eth1:
+// 3pri -> tc0, 2pri -> tc1, (0,1,4-7)pri -> tc2
+// Map traffic class to transmit queue:
+// tc0 -> txq2, tc1 -> txq3, tc2 -> (txq6, txq7)
+$ tc qdisc replace dev eth1 handle 100: parent root mqprio num_tc 3 \
+map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@2 1@3 2@6 hw 1
+
+13)
+// Check classes settings
+$ tc -g class show dev eth1
++---(100:ffe2) mqprio
+|    +---(100:7) mqprio
+|    +---(100:8) mqprio
+|
++---(100:ffe1) mqprio
+|    +---(100:4) mqprio
+|
++---(100:ffe0) mqprio
+     +---(100:3) mqprio
+
+14)
+// Set rate for class A - 31 Mbit (tc0, txq2) using CBS Qdisc for Eth1
+// here only idle slope is important, others ignored
+// Set it +1 Mb for reserve (important!)
+$ tc qdisc add dev eth1 parent 100:1 cbs locredit -1453 \
+hicredit 47 sendslope -969000 idleslope 31000 offload 1
+net eth1: set FIFO3 bw = 40
+
+15)
+// Set rate for class B - 11 Mbit (tc1, txq3) using CBS Qdisc for Eth1
+// Set it +1 Mb for reserve (important!)
+$ tc qdisc add dev eth1 parent 100:2 cbs locredit -1483 \
+hicredit 34 sendslope -989000 idleslope 11000 offload 1
+net eth1: set FIFO2 bw = 20
+
+16)
+// Create vlan 100 to map sk->priority to vlan qos for Eth1
+$ ip link add link eth1 name eth1.100 type vlan id 100
+net eth1: Adding vlanid 100 to vlan filter
+
+17)
+// Map skb->priority to L2 prio for Eth1.100, one to one
+$ ip link set eth1.100 type vlan \
+egress 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
+
+18)
+// Check egress map for vlan 100
+$ cat /proc/net/vlan/eth1.100
+[...]
+INGRESS priority mappings: 0:0  1:0  2:0  3:0  4:0  5:0  6:0 7:0
+EGRESS priority mappings: 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
+
+19)
+// Run appropriate tools with socket option "SO_PRIORITY" to 3
+// for class A and to 2 for class B. For both interfaces
+./tsn_talker -d 18:03:73:66:87:42 -i eth0.100 -p2 -s 1500&
+./tsn_talker -d 18:03:73:66:87:42 -i eth0.100 -p3 -s 1500&
+./tsn_talker -d 20:cf:30:85:7d:fd -i eth1.100 -p2 -s 1500&
+./tsn_talker -d 20:cf:30:85:7d:fd -i eth1.100 -p3 -s 1500&
+
+20)
+// run your listeners on workstations
+// (I took at https://www.spinics.net/lists/netdev/msg460869.html)
+./tsn_listener -d 18:03:73:66:87:42 -i enp5s0 -s 1500
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39000 kbps
+
+21)
+// Restore default configuration if needed
+$ ip link del eth1.100
+$ ip link del eth0.100
+$ tc qdisc del dev eth1 root
+net eth1: Prev FIFO2 is shaped
+net eth1: set FIFO3 bw = 0
+net eth1: set FIFO2 bw = 0
+$ tc qdisc del dev eth0 root
+net eth0: Prev FIFO2 is shaped
+net eth0: set FIFO3 bw = 0
+net eth0: set FIFO2 bw = 0
+$ ethtool -L eth0 rx 1 tx 1
-- 
2.17.0

^ permalink raw reply related

* [RFC PATCH 4/6] net: ethernet: ti: cpsw: add CBS Qdisc offload
From: Ivan Khoronzhuk @ 2018-05-18 21:15 UTC (permalink / raw)
  To: grygorii.strashko, davem
  Cc: corbet, akpm, netdev, linux-doc, linux-kernel, linux-omap,
	vinicius.gomes, henrik, jesus.sanchez-palencia, Ivan Khoronzhuk
In-Reply-To: <20180518211510.13341-1-ivan.khoronzhuk@linaro.org>

The cpsw has up to 4 FIFOs per port and upper 3 FIFOs can feed rate
limited queue with shaping. In order to set and enable shaping for
those 3 FIFOs queues the network device with CBS qdisc attached is
needed. The CBS configuration is added for dual-emac/single port mode
only, but potentially can be used in switch mode also, based on
switchdev for instance.

Despite the FIFO shapers can work w/o cpdma level shapers the base
usage must be in combine with cpdma level shapers as described in TRM,
that are set as maximum rates for interface queues with sysfs.

One of the possible configuration with txq shapers and CBS shapers:

                      Configured with echo RATE >
                  /sys/class/net/eth0/queues/tx-0/tx_maxrate
             /---------------------------------------------------
            /
           /            cpdma level shapers
        +----+ +----+ +----+ +----+ +----+ +----+ +----+ +----+
        | c7 | | c6 | | c5 | | c4 | | c3 | | c2 | | c1 | | c0 |
        \    / \    / \    / \    / \    / \    / \    / \    /
         \  /   \  /   \  /   \  /   \  /   \  /   \  /   \  /
          \/     \/     \/     \/     \/     \/     \/     \/
+---------|------|------|------|-------------------------------------+
|    +----+      |      |  +---+                                     |
|    |      +----+      |  |                                         |
|    v      v           v  v                                         |
| +----+ +----+ +----+ +----+ p        p+----+ +----+ +----+ +----+  |
| |    | |    | |    | |    | o        o|    | |    | |    | |    |  |
| | f3 | | f2 | | f1 | | f0 | r  CPSW  r| f3 | | f2 | | f1 | | f0 |  |
| |    | |    | |    | |    | t        t|    | |    | |    | |    |  |
| \    / \    / \    / \    / 0        1\    / \    / \    / \    /  |
|  \  X   \  /   \  /   \  /             \  /   \  /   \  /   \  /   |
|   \/ \   \/     \/     \/               \/     \/     \/     \/    |
+-------\------------------------------------------------------------+
         \
          \ FIFO shaper, set with CBS offload added in this patch,
           \ FIFO0 cannot be rate limited
            ------------------------------------------------------

CBS shaper configuration is supposed to be used with root MQPRIO Qdisc
offload allowing to add sk_prio->tc->txq maps that direct traffic to
appropriate tx queue and maps L2 priority to FIFO shaper.

The CBS shaper is intended to be used for AVB where L2 priority
(pcp field) is used to differentiate class of traffic. So additionally
vlan needs to be created with appropriate egress sk_prio->l2 prio map.

If CBS has several tx queues assigned to it, the sum of their
bandwidth has not overlap bandwidth set for CBS. It's recomended the
CBS bandwidth to be a little bit more.

The CBS shaper is configured with CBS qdisc offload interface using tc
tool from iproute2 packet.

For instance:

$ tc qdisc replace dev eth0 handle 100: parent root mqprio num_tc 3 \
map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 1

$ tc -g class show dev eth0
+---(100:ffe2) mqprio
|    +---(100:3) mqprio
|    +---(100:4) mqprio
|    
+---(100:ffe1) mqprio
|    +---(100:2) mqprio
|    
+---(100:ffe0) mqprio
     +---(100:1) mqprio

$ tc qdisc add dev eth0 parent 100:1 cbs locredit -1440 \
hicredit 60 sendslope -960000 idleslope 40000 offload 1

$ tc qdisc add dev eth0 parent 100:2 cbs locredit -1470 \
hicredit 62 sendslope -980000 idleslope 20000 offload 1

The above code set CBS shapers for tc0 and tc1, for that txq0 and
txq1 is used. Pay attention, the real set bandwidth can differ a bit
due to discreteness of configuration parameters.

Here parameters like locredit, hicredit and sendslope are ignored
internally and are supposed to be set with assumption that maximum
frame size for frame - 1500.

It's supposed that interface speed is not changed while reconnection,
not always is true, so inform user in case speed of interface was
changed, as it can impact on dependent shapers configuration.

For more examples see Documentation.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
---
 drivers/net/ethernet/ti/cpsw.c | 221 +++++++++++++++++++++++++++++++++
 1 file changed, 221 insertions(+)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 4b232cda5436..c7710b0e1c17 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -46,6 +46,8 @@
 #include "cpts.h"
 #include "davinci_cpdma.h"
 
+#include <net/pkt_sched.h>
+
 #define CPSW_DEBUG	(NETIF_MSG_HW		| NETIF_MSG_WOL		| \
 			 NETIF_MSG_DRV		| NETIF_MSG_LINK	| \
 			 NETIF_MSG_IFUP		| NETIF_MSG_INTR	| \
@@ -154,8 +156,12 @@ do {								\
 #define IRQ_NUM			2
 #define CPSW_MAX_QUEUES		8
 #define CPSW_CPDMA_DESCS_POOL_SIZE_DEFAULT 256
+#define CPSW_FIFO_QUEUE_TYPE_SHIFT	16
+#define CPSW_FIFO_SHAPE_EN_SHIFT	16
+#define CPSW_FIFO_RATE_EN_SHIFT		20
 #define CPSW_TC_NUM			4
 #define CPSW_FIFO_SHAPERS_NUM		(CPSW_TC_NUM - 1)
+#define CPSW_PCT_MASK			0x7f
 
 #define CPSW_RX_VLAN_ENCAP_HDR_PRIO_SHIFT	29
 #define CPSW_RX_VLAN_ENCAP_HDR_PRIO_MSK		GENMASK(2, 0)
@@ -457,6 +463,8 @@ struct cpsw_priv {
 	bool				rx_pause;
 	bool				tx_pause;
 	bool				mqprio_hw;
+	int				fifo_bw[CPSW_TC_NUM];
+	int				shp_cfg_speed;
 	u32 emac_port;
 	struct cpsw_common *cpsw;
 };
@@ -1081,6 +1089,38 @@ static void cpsw_set_slave_mac(struct cpsw_slave *slave,
 	slave_write(slave, mac_lo(priv->mac_addr), SA_LO);
 }
 
+static bool cpsw_shp_is_off(struct cpsw_priv *priv)
+{
+	struct cpsw_common *cpsw = priv->cpsw;
+	struct cpsw_slave *slave;
+	u32 shift, mask, val;
+
+	val = readl_relaxed(&cpsw->regs->ptype);
+
+	slave = &cpsw->slaves[cpsw_slave_index(cpsw, priv)];
+	shift = CPSW_FIFO_SHAPE_EN_SHIFT + 3 * slave->slave_num;
+	mask = 7 << shift;
+	val = val & mask;
+
+	return !val;
+}
+
+static void cpsw_fifo_shp_on(struct cpsw_priv *priv, int fifo, int on)
+{
+	struct cpsw_common *cpsw = priv->cpsw;
+	struct cpsw_slave *slave;
+	u32 shift, mask, val;
+
+	val = readl_relaxed(&cpsw->regs->ptype);
+
+	slave = &cpsw->slaves[cpsw_slave_index(cpsw, priv)];
+	shift = CPSW_FIFO_SHAPE_EN_SHIFT + 3 * slave->slave_num;
+	mask = (1 << --fifo) << shift;
+	val = on ? val | mask : val & ~mask;
+
+	writel_relaxed(val, &cpsw->regs->ptype);
+}
+
 static void _cpsw_adjust_link(struct cpsw_slave *slave,
 			      struct cpsw_priv *priv, bool *link)
 {
@@ -1120,6 +1160,12 @@ static void _cpsw_adjust_link(struct cpsw_slave *slave,
 			mac_control |= BIT(4);
 
 		*link = true;
+
+		if (priv->shp_cfg_speed &&
+		    priv->shp_cfg_speed != slave->phy->speed &&
+		    !cpsw_shp_is_off(priv))
+			dev_warn(priv->dev,
+				 "Speed was changed, CBS sahper speeds are changed!");
 	} else {
 		mac_control = 0;
 		/* disable forwarding */
@@ -1589,6 +1635,178 @@ static int cpsw_tc_to_fifo(int tc, int num_tc)
 	return CPSW_FIFO_SHAPERS_NUM - tc;
 }
 
+static int cpsw_set_fifo_bw(struct cpsw_priv *priv, int fifo, int bw)
+{
+	struct cpsw_common *cpsw = priv->cpsw;
+	u32 val = 0, send_pct, shift;
+	struct cpsw_slave *slave;
+	int pct = 0, i;
+
+	if (bw > priv->shp_cfg_speed * 1000)
+		goto err;
+
+	/* shaping has to stay enabled for highest fifos linearly
+	 * and fifo bw no more then interface can allow
+	 */
+	slave = &cpsw->slaves[cpsw_slave_index(cpsw, priv)];
+	send_pct = slave_read(slave, SEND_PERCENT);
+	for (i = CPSW_FIFO_SHAPERS_NUM; i > 0; i--) {
+		if (!bw) {
+			if (i >= fifo || !priv->fifo_bw[i])
+				continue;
+
+			dev_warn(priv->dev, "Prev FIFO%d is shaped", i);
+			continue;
+		}
+
+		if (!priv->fifo_bw[i] && i > fifo) {
+			dev_err(priv->dev, "Upper FIFO%d is not shaped", i);
+			return -EINVAL;
+		}
+
+		shift = (i - 1) * 8;
+		if (i == fifo) {
+			send_pct &= ~(CPSW_PCT_MASK << shift);
+			val = DIV_ROUND_UP(bw, priv->shp_cfg_speed * 10);
+			if (!val)
+				val = 1;
+
+			send_pct |= val << shift;
+			pct += val;
+			continue;
+		}
+
+		if (priv->fifo_bw[i])
+			pct += (send_pct >> shift) & CPSW_PCT_MASK;
+	}
+
+	if (pct >= 100)
+		goto err;
+
+	slave_write(slave, send_pct, SEND_PERCENT);
+	priv->fifo_bw[fifo] = bw;
+
+	dev_warn(priv->dev, "set FIFO%d bw = %d\n", fifo,
+		 DIV_ROUND_CLOSEST(val * priv->shp_cfg_speed, 100));
+
+	return 0;
+err:
+	dev_err(priv->dev, "Bandwidth doesn't fit in tc configuration");
+	return -EINVAL;
+}
+
+static int cpsw_set_fifo_rlimit(struct cpsw_priv *priv, int fifo, int bw)
+{
+	struct cpsw_common *cpsw = priv->cpsw;
+	struct cpsw_slave *slave;
+	u32 tx_in_ctl_rg, val;
+	int ret;
+
+	ret = cpsw_set_fifo_bw(priv, fifo, bw);
+	if (ret)
+		return ret;
+
+	slave = &cpsw->slaves[cpsw_slave_index(cpsw, priv)];
+	tx_in_ctl_rg = cpsw->version == CPSW_VERSION_1 ?
+		       CPSW1_TX_IN_CTL : CPSW2_TX_IN_CTL;
+
+	if (!bw)
+		cpsw_fifo_shp_on(priv, fifo, bw);
+
+	val = slave_read(slave, tx_in_ctl_rg);
+	if (cpsw_shp_is_off(priv)) {
+		/* disable FIFOs rate limited queues */
+		val &= ~(0xf << CPSW_FIFO_RATE_EN_SHIFT);
+
+		/* set type of FIFO queues to normal priority mode */
+		val &= ~(3 << CPSW_FIFO_QUEUE_TYPE_SHIFT);
+
+		/* set type of FIFO queues to be rate limited */
+		if (bw)
+			val |= 2 << CPSW_FIFO_QUEUE_TYPE_SHIFT;
+		else
+			priv->shp_cfg_speed = 0;
+	}
+
+	/* toggle a FIFO rate limited queue */
+	if (bw)
+		val |= BIT(fifo + CPSW_FIFO_RATE_EN_SHIFT);
+	else
+		val &= ~BIT(fifo + CPSW_FIFO_RATE_EN_SHIFT);
+	slave_write(slave, val, tx_in_ctl_rg);
+
+	/* FIFO transmit shape enable */
+	cpsw_fifo_shp_on(priv, fifo, bw);
+	return 0;
+}
+
+/* Defaults:
+ * class A - prio 3
+ * class B - prio 2
+ * shaping for class A should be set first
+ */
+static int cpsw_set_cbs(struct net_device *ndev,
+			struct tc_cbs_qopt_offload *qopt)
+{
+	struct cpsw_priv *priv = netdev_priv(ndev);
+	struct cpsw_common *cpsw = priv->cpsw;
+	struct cpsw_slave *slave;
+	int prev_speed = 0;
+	int tc, ret, fifo;
+	u32 bw = 0;
+
+	tc = netdev_txq_to_tc(priv->ndev, qopt->queue);
+
+	/* enable channels in backward order, as highest FIFOs must be rate
+	 * limited first and for compliance with CPDMA rate limited channels
+	 * that also used in bacward order. FIFO0 cannot be rate limited.
+	 */
+	fifo = cpsw_tc_to_fifo(tc, ndev->num_tc);
+	if (!fifo) {
+		dev_err(priv->dev, "Last tc%d can't be rate limited", tc);
+		return -EINVAL;
+	}
+
+	/* do nothing, it's disabled anyway */
+	if (!qopt->enable && !priv->fifo_bw[fifo])
+		return 0;
+
+	/* shapers can be set if link speed is known */
+	slave = &cpsw->slaves[cpsw_slave_index(cpsw, priv)];
+	if (slave->phy && slave->phy->link) {
+		if (priv->shp_cfg_speed &&
+		    priv->shp_cfg_speed != slave->phy->speed)
+			prev_speed = priv->shp_cfg_speed;
+
+		priv->shp_cfg_speed = slave->phy->speed;
+	}
+
+	if (!priv->shp_cfg_speed) {
+		dev_err(priv->dev, "Link speed is not known");
+		return -1;
+	}
+
+	ret = pm_runtime_get_sync(cpsw->dev);
+	if (ret < 0) {
+		pm_runtime_put_noidle(cpsw->dev);
+		return ret;
+	}
+
+	bw = qopt->enable ? qopt->idleslope : 0;
+	ret = cpsw_set_fifo_rlimit(priv, fifo, bw);
+	if (ret) {
+		priv->shp_cfg_speed = prev_speed;
+		prev_speed = 0;
+	}
+
+	if (bw && prev_speed)
+		dev_warn(priv->dev,
+			 "Speed was changed, CBS sahper speeds are changed!");
+
+	pm_runtime_put_sync(cpsw->dev);
+	return ret;
+}
+
 static int cpsw_ndo_open(struct net_device *ndev)
 {
 	struct cpsw_priv *priv = netdev_priv(ndev);
@@ -2263,6 +2481,9 @@ static int cpsw_ndo_setup_tc(struct net_device *ndev, enum tc_setup_type type,
 			     void *type_data)
 {
 	switch (type) {
+	case TC_SETUP_QDISC_CBS:
+		return cpsw_set_cbs(ndev, type_data);
+
 	case TC_SETUP_QDISC_MQPRIO:
 		return cpsw_set_tc(ndev, type_data);
 
-- 
2.17.0

^ permalink raw reply related

* [RFC PATCH 2/6] net: ethernet: ti: cpdma: fit rated channels in backward order
From: Ivan Khoronzhuk @ 2018-05-18 21:15 UTC (permalink / raw)
  To: grygorii.strashko, davem
  Cc: corbet, akpm, netdev, linux-doc, linux-kernel, linux-omap,
	vinicius.gomes, henrik, jesus.sanchez-palencia, Ivan Khoronzhuk
In-Reply-To: <20180518211510.13341-1-ivan.khoronzhuk@linaro.org>

According to TRM tx rated channels should be in 7..0 order,
so correct it.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
---
 drivers/net/ethernet/ti/davinci_cpdma.c | 31 ++++++++++++-------------
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ethernet/ti/davinci_cpdma.c b/drivers/net/ethernet/ti/davinci_cpdma.c
index 31ae04117f0a..37fbdc668cc7 100644
--- a/drivers/net/ethernet/ti/davinci_cpdma.c
+++ b/drivers/net/ethernet/ti/davinci_cpdma.c
@@ -406,37 +406,36 @@ static int cpdma_chan_fit_rate(struct cpdma_chan *ch, u32 rate,
 	struct cpdma_chan *chan;
 	u32 old_rate = ch->rate;
 	u32 new_rmask = 0;
-	int rlim = 1;
+	int rlim = 0;
 	int i;
 
-	*prio_mode = 0;
 	for (i = tx_chan_num(0); i < tx_chan_num(CPDMA_MAX_CHANNELS); i++) {
 		chan = ctlr->channels[i];
-		if (!chan) {
-			rlim = 0;
+		if (!chan)
 			continue;
-		}
 
 		if (chan == ch)
 			chan->rate = rate;
 
 		if (chan->rate) {
-			if (rlim) {
-				new_rmask |= chan->mask;
-			} else {
-				ch->rate = old_rate;
-				dev_err(ctlr->dev, "Prev channel of %dch is not rate limited\n",
-					chan->chan_num);
-				return -EINVAL;
-			}
-		} else {
-			*prio_mode = 1;
-			rlim = 0;
+			rlim = 1;
+			new_rmask |= chan->mask;
+			continue;
 		}
+
+		if (rlim)
+			goto err;
 	}
 
 	*rmask = new_rmask;
+	*prio_mode = rlim;
 	return 0;
+
+err:
+	ch->rate = old_rate;
+	dev_err(ctlr->dev, "Upper cpdma ch%d is not rate limited\n",
+		chan->chan_num);
+	return -EINVAL;
 }
 
 static u32 cpdma_chan_set_factors(struct cpdma_ctlr *ctlr,
-- 
2.17.0

^ permalink raw reply related

* [RFC PATCH 1/6] net: ethernet: ti: cpsw: use cpdma channels in backward order for txq
From: Ivan Khoronzhuk @ 2018-05-18 21:15 UTC (permalink / raw)
  To: grygorii.strashko, davem
  Cc: corbet, akpm, netdev, linux-doc, linux-kernel, linux-omap,
	vinicius.gomes, henrik, jesus.sanchez-palencia, Ivan Khoronzhuk
In-Reply-To: <20180518211510.13341-1-ivan.khoronzhuk@linaro.org>

The cpdma channel highest priority is from hi to lo number.
The driver has limited number of descriptors that are shared between
number of cpdma channels. Number of queues can be tuned with ethtool,
that allows to not spend descriptors on not needed cpdma channels.
In AVB usually only 2 tx queues can be enough with rate limitation.
The rate limitation can be used only for hi priority queues. Thus, to
use only 2 queues the 8 has to be created. It's wasteful.

So, in order to allow using only needed number of rate limited
tx queues, save resources, and be able to set rate limitation for
them, let assign tx cpdma channels in backward order to queues.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
---
 drivers/net/ethernet/ti/cpsw.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index a7285dddfd29..9bd615da04d3 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -967,8 +967,8 @@ static int cpsw_tx_mq_poll(struct napi_struct *napi_tx, int budget)
 
 	/* process every unprocessed channel */
 	ch_map = cpdma_ctrl_txchs_state(cpsw->dma);
-	for (ch = 0, num_tx = 0; ch_map; ch_map >>= 1, ch++) {
-		if (!(ch_map & 0x01))
+	for (ch = 0, num_tx = 0; ch_map & 0xff; ch_map <<= 1, ch++) {
+		if (!(ch_map & 0x80))
 			continue;
 
 		txv = &cpsw->txv[ch];
@@ -2431,7 +2431,7 @@ static int cpsw_update_channels_res(struct cpsw_priv *priv, int ch_num, int rx)
 	void (*handler)(void *, int, int);
 	struct netdev_queue *queue;
 	struct cpsw_vector *vec;
-	int ret, *ch;
+	int ret, *ch, vch;
 
 	if (rx) {
 		ch = &cpsw->rx_ch_num;
@@ -2444,7 +2444,8 @@ static int cpsw_update_channels_res(struct cpsw_priv *priv, int ch_num, int rx)
 	}
 
 	while (*ch < ch_num) {
-		vec[*ch].ch = cpdma_chan_create(cpsw->dma, *ch, handler, rx);
+		vch = rx ? *ch : 7 - *ch;
+		vec[*ch].ch = cpdma_chan_create(cpsw->dma, vch, handler, rx);
 		queue = netdev_get_tx_queue(priv->ndev, *ch);
 		queue->tx_maxrate = 0;
 
@@ -2980,7 +2981,7 @@ static int cpsw_probe(struct platform_device *pdev)
 	u32 slave_offset, sliver_offset, slave_size;
 	const struct soc_device_attribute *soc;
 	struct cpsw_common		*cpsw;
-	int ret = 0, i;
+	int ret = 0, i, ch;
 	int irq;
 
 	cpsw = devm_kzalloc(&pdev->dev, sizeof(struct cpsw_common), GFP_KERNEL);
@@ -3155,7 +3156,8 @@ static int cpsw_probe(struct platform_device *pdev)
 	if (soc)
 		cpsw->quirk_irq = 1;
 
-	cpsw->txv[0].ch = cpdma_chan_create(cpsw->dma, 0, cpsw_tx_handler, 0);
+	ch = cpsw->quirk_irq ? 0 : 7;
+	cpsw->txv[0].ch = cpdma_chan_create(cpsw->dma, ch, cpsw_tx_handler, 0);
 	if (IS_ERR(cpsw->txv[0].ch)) {
 		dev_err(priv->dev, "error initializing tx dma channel\n");
 		ret = PTR_ERR(cpsw->txv[0].ch);
-- 
2.17.0

^ permalink raw reply related

* [RFC PATCH 0/6] net: ethernet: ti: cpsw: add MQPRIO and CBS Qdisc offload
From: Ivan Khoronzhuk @ 2018-05-18 21:15 UTC (permalink / raw)
  To: grygorii.strashko, davem
  Cc: corbet, akpm, netdev, linux-doc, linux-kernel, linux-omap,
	vinicius.gomes, henrik, jesus.sanchez-palencia, Ivan Khoronzhuk

This series adds MQPRIO and CBS Qdisc offload for TI cpsw driver.
It potentially can be used in audio video bridging (AVB) and time
sensitive networking (TSN).

Patchset was tested on AM572x EVM and BBB boards. Last patch from this
series adds detailed description of configuration with examples. For
consistency reasons, in role of talker and listener, tools from
patchset "TSN: Add qdisc based config interface for CBS" were used and
can be seen here: https://www.spinics.net/lists/netdev/msg460869.html

Based on net-next/master

Ivan Khoronzhuk (6):
  net: ethernet: ti: cpsw: use cpdma channels in backward order for txq
  net: ethernet: ti: cpdma: fit rated channels in backward order
  net: ethernet: ti: cpsw: add MQPRIO Qdisc offload
  net: ethernet: ti: cpsw: add CBS Qdisc offload
  net: ethernet: ti: cpsw: restore shaper configuration while down/up
  Documentation: networking: cpsw: add MQPRIO & CBS offload examples

 Documentation/networking/cpsw.txt       | 540 ++++++++++++++++++++++++
 drivers/net/ethernet/ti/cpsw.c          | 364 +++++++++++++++-
 drivers/net/ethernet/ti/davinci_cpdma.c |  31 +-
 3 files changed, 913 insertions(+), 22 deletions(-)
 create mode 100644 Documentation/networking/cpsw.txt

-- 
2.17.0

^ permalink raw reply

* Re: [bpf-next PATCH v2 0/2] SK_MSG programs: read sock fields
From: Daniel Borkmann @ 2018-05-18 21:11 UTC (permalink / raw)
  To: John Fastabend, ast; +Cc: netdev
In-Reply-To: <20180517211452.14426.98480.stgit@john-Precision-Tower-5810>

On 05/17/2018 11:16 PM, John Fastabend wrote:
> In this series we add the ability for sk msg programs to read basic
> sock information about the sock they are attached to. The second
> patch adds the tests to the selftest test_verifier.
> 
> One obseration that I had from writing this seriess is lots of the
> ./net/core/filter.c code is almost duplicated across program types.
> I thought about building a template/macro that we could use as a
> single block of code to read sock data out for multiple programs,
> but I wasn't convinced it was worth it yet. The result was using a
> macro saved a couple lines of code per block but made the code
> a bit harder to read IMO. We can probably revisit the idea later
> if we get more duplication.
> 
> v2: add errstr field to negative test_verifier test cases to ensure
>     we get the expected err string back from the verifier.
> 
> ---
> 
> John Fastabend (2):
>       bpf: allow sk_msg programs to read sock fields
>       bpf: add sk_msg prog sk access tests to test_verifier
> 
> 
>  include/linux/filter.h                      |    1 
>  include/uapi/linux/bpf.h                    |    8 ++
>  kernel/bpf/sockmap.c                        |    1 
>  net/core/filter.c                           |  114 ++++++++++++++++++++++++++-
>  tools/include/uapi/linux/bpf.h              |    8 ++
>  tools/testing/selftests/bpf/test_verifier.c |  115 +++++++++++++++++++++++++++
>  6 files changed, 244 insertions(+), 3 deletions(-)
> 
> --
> Signature
> 

Applied to bpf-next, thanks John!

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox