* [PATCH bpf-next] bpf, capabilities: introduce CAP_BPF
From: Alexei Starovoitov @ 2019-08-27 20:52 UTC
To: luto; +Cc: davem, daniel, netdev, bpf, kernel-team, linux-api
Introduce CAP_BPF, which allows loading all types of BPF programs,
creating most map types, loading BTF, and iterating over programs and maps.
CAP_BPF alone is not enough to attach or run programs.
Networking:
CAP_BPF and CAP_NET_ADMIN are necessary to:
- attach to cgroup-bpf hooks like INET_INGRESS, INET_SOCK_CREATE, INET4_CONNECT
- run networking bpf programs (like xdp, skb, flow_dissector)
Tracing:
CAP_BPF and !perf_paranoid_tracepoint_raw() (i.e. kernel.perf_event_paranoid == -1)
are necessary to:
- attach bpf program to raw tracepoint
- use bpf_trace_printk() in all program types (not only tracing programs)
- create bpf stackmap
To attach bpf to perf_events, perf_event_open() needs to succeed as usual.
CAP_BPF controls the BPF side.
CAP_NET_ADMIN controls the intersection where BPF calls into networking.
perf_paranoid_tracepoint_raw controls the intersection where BPF calls into tracing.
In the future, CAP_TRACING could be introduced to control the
creation of kprobes/uprobes and attaching bpf to perf_events.
In that case the thin bpf_probe_read() wrapper would be controlled by CAP_BPF,
whereas probe_read() would be controlled by CAP_TRACING.
CAP_TRACING would also control generic kprobe+probe_read.
Both CAP_BPF and CAP_TRACING would be necessary for tracing bpf programs
that want to use bpf_probe_read.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
I would prefer to introduce CAP_TRACING soon, since it
will make the tracing and networking permission models symmetrical.
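The permission model described in the commit message can be sketched as a
small userspace model. This is only an illustration, not kernel code: the
struct and the model_* names are hypothetical, and the capability checks are
reduced to booleans. The tracing predicate follows cap_bpf_tracing() as added
by this patch, and the "paranoid" check assumes the sysctl is compared
against -1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a task's privileges; not a kernel API. */
struct task_privs {
	bool cap_sys_admin;
	bool cap_bpf;
	bool cap_net_admin;
	int perf_event_paranoid;	/* models kernel.perf_event_paranoid */
};

/* Assumption: raw tracepoints count as "paranoid" unless the sysctl is -1. */
static bool model_paranoid_tracepoint_raw(const struct task_privs *t)
{
	assert(t != NULL);
	return t->perf_event_paranoid > -1;
}

/* Follows cap_bpf_tracing() from this patch. */
static bool model_cap_bpf_tracing(const struct task_privs *t)
{
	return t->cap_sys_admin ||
	       (t->cap_bpf && !model_paranoid_tracepoint_raw(t));
}

/* Loading any program type needs only CAP_BPF in this patch. */
static bool model_can_load(const struct task_privs *t)
{
	return t->cap_bpf;
}

/* Running networking programs needs CAP_BPF and CAP_NET_ADMIN. */
static bool model_can_run_networking(const struct task_privs *t)
{
	return t->cap_bpf && t->cap_net_admin;
}
```

In this model a task with only CAP_BPF can load programs but can neither run
networking programs nor use the tracing paths until CAP_NET_ADMIN is added or
perf_event_paranoid is set to -1.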
include/linux/filter.h | 1 +
include/uapi/linux/capability.h | 5 ++-
kernel/bpf/arraymap.c | 2 +-
kernel/bpf/cgroup.c | 2 +-
kernel/bpf/core.c | 10 ++++--
kernel/bpf/cpumap.c | 2 +-
kernel/bpf/hashtab.c | 4 +--
kernel/bpf/lpm_trie.c | 2 +-
kernel/bpf/queue_stack_maps.c | 2 +-
kernel/bpf/reuseport_array.c | 2 +-
kernel/bpf/stackmap.c | 2 +-
kernel/bpf/syscall.c | 32 ++++++++++-------
kernel/bpf/verifier.c | 4 +--
kernel/trace/bpf_trace.c | 2 +-
net/core/bpf_sk_storage.c | 2 +-
net/core/filter.c | 10 +++---
security/selinux/include/classmap.h | 4 +--
tools/testing/selftests/bpf/test_verifier.c | 39 ++++++++++++++++-----
18 files changed, 84 insertions(+), 43 deletions(-)
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 92c6e31fb008..16cea50af014 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -857,6 +857,7 @@ static inline bool bpf_dump_raw_ok(void)
return kallsyms_show_value() == 1;
}
+bool cap_bpf_tracing(void);
struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off,
const struct bpf_insn *patch, u32 len);
int bpf_remove_insns(struct bpf_prog *prog, u32 off, u32 cnt);
diff --git a/include/uapi/linux/capability.h b/include/uapi/linux/capability.h
index 240fdb9a60f6..b3390f34c9f5 100644
--- a/include/uapi/linux/capability.h
+++ b/include/uapi/linux/capability.h
@@ -366,8 +366,11 @@ struct vfs_ns_cap_data {
#define CAP_AUDIT_READ 37
+/* Allow bpf() syscall except attach and tracing */
-#define CAP_LAST_CAP CAP_AUDIT_READ
+#define CAP_BPF 38
+
+#define CAP_LAST_CAP CAP_BPF
#define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 1c65ce0098a9..045e30b7160d 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -73,7 +73,7 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
bool percpu = attr->map_type == BPF_MAP_TYPE_PERCPU_ARRAY;
int ret, numa_node = bpf_map_attr_numa_node(attr);
u32 elem_size, index_mask, max_entries;
- bool unpriv = !capable(CAP_SYS_ADMIN);
+ bool unpriv = !capable(CAP_BPF);
u64 cost, array_size, mask64;
struct bpf_map_memory mem;
struct bpf_array *array;
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 6a6a154cfa7b..97f733354421 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -795,7 +795,7 @@ cgroup_base_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
case BPF_FUNC_get_current_cgroup_id:
return &bpf_get_current_cgroup_id_proto;
case BPF_FUNC_trace_printk:
- if (capable(CAP_SYS_ADMIN))
+ if (cap_bpf_tracing())
return bpf_get_trace_printk_proto();
/* fall through */
default:
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 8191a7db2777..5756c8a56f44 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -646,7 +646,7 @@ static bool bpf_prog_kallsyms_verify_off(const struct bpf_prog *fp)
void bpf_prog_kallsyms_add(struct bpf_prog *fp)
{
if (!bpf_prog_kallsyms_candidate(fp) ||
- !capable(CAP_SYS_ADMIN))
+ !capable(CAP_BPF))
return;
spin_lock_bh(&bpf_lock);
@@ -768,7 +768,7 @@ static int bpf_jit_charge_modmem(u32 pages)
{
if (atomic_long_add_return(pages, &bpf_jit_current) >
(bpf_jit_limit >> PAGE_SHIFT)) {
- if (!capable(CAP_SYS_ADMIN)) {
+ if (!capable(CAP_BPF)) {
atomic_long_sub(pages, &bpf_jit_current);
return -EPERM;
}
@@ -2104,6 +2104,12 @@ int __weak skb_copy_bits(const struct sk_buff *skb, int offset, void *to,
DEFINE_STATIC_KEY_FALSE(bpf_stats_enabled_key);
EXPORT_SYMBOL(bpf_stats_enabled_key);
+bool cap_bpf_tracing(void)
+{
+ return capable(CAP_SYS_ADMIN) ||
+ (capable(CAP_BPF) && !perf_paranoid_tracepoint_raw());
+}
+
/* All definitions of tracepoints related to BPF. */
#define CREATE_TRACE_POINTS
#include <linux/bpf_trace.h>
diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
index ef49e17ae47c..ca483c9a9c2e 100644
--- a/kernel/bpf/cpumap.c
+++ b/kernel/bpf/cpumap.c
@@ -84,7 +84,7 @@ static struct bpf_map *cpu_map_alloc(union bpf_attr *attr)
int ret, cpu;
u64 cost;
- if (!capable(CAP_SYS_ADMIN))
+ if (!capable(CAP_BPF))
return ERR_PTR(-EPERM);
/* check sanity of attributes */
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 22066a62c8c9..f459315625ac 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -244,9 +244,9 @@ static int htab_map_alloc_check(union bpf_attr *attr)
BUILD_BUG_ON(offsetof(struct htab_elem, fnode.next) !=
offsetof(struct htab_elem, hash_node.pprev));
- if (lru && !capable(CAP_SYS_ADMIN))
+ if (lru && !capable(CAP_BPF))
/* LRU implementation is much complicated than other
- * maps. Hence, limit to CAP_SYS_ADMIN for now.
+ * maps. Hence, limit to CAP_BPF.
*/
return -EPERM;
diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c
index 56e6c75d354d..a45fa5464d98 100644
--- a/kernel/bpf/lpm_trie.c
+++ b/kernel/bpf/lpm_trie.c
@@ -543,7 +543,7 @@ static struct bpf_map *trie_alloc(union bpf_attr *attr)
u64 cost = sizeof(*trie), cost_per_node;
int ret;
- if (!capable(CAP_SYS_ADMIN))
+ if (!capable(CAP_BPF))
return ERR_PTR(-EPERM);
/* check sanity of attributes */
diff --git a/kernel/bpf/queue_stack_maps.c b/kernel/bpf/queue_stack_maps.c
index f697647ceb54..ca0ba9edca86 100644
--- a/kernel/bpf/queue_stack_maps.c
+++ b/kernel/bpf/queue_stack_maps.c
@@ -45,7 +45,7 @@ static bool queue_stack_map_is_full(struct bpf_queue_stack *qs)
/* Called from syscall */
static int queue_stack_map_alloc_check(union bpf_attr *attr)
{
- if (!capable(CAP_SYS_ADMIN))
+ if (!capable(CAP_BPF))
return -EPERM;
/* check sanity of attributes */
diff --git a/kernel/bpf/reuseport_array.c b/kernel/bpf/reuseport_array.c
index 50c083ba978c..bfad7d41a061 100644
--- a/kernel/bpf/reuseport_array.c
+++ b/kernel/bpf/reuseport_array.c
@@ -154,7 +154,7 @@ static struct bpf_map *reuseport_array_alloc(union bpf_attr *attr)
struct bpf_map_memory mem;
u64 array_size;
- if (!capable(CAP_SYS_ADMIN))
+ if (!capable(CAP_BPF))
return ERR_PTR(-EPERM);
array_size = sizeof(*array);
diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index 052580c33d26..c540b2b3fc4a 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -90,7 +90,7 @@ static struct bpf_map *stack_map_alloc(union bpf_attr *attr)
u64 cost, n_buckets;
int err;
- if (!capable(CAP_SYS_ADMIN))
+ if (!cap_bpf_tracing())
return ERR_PTR(-EPERM);
if (attr->map_flags & ~STACK_CREATE_FLAG_MASK)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index c0f62fd67c6b..ef7b06ca30e5 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1176,7 +1176,7 @@ static int map_freeze(const union bpf_attr *attr)
err = -EBUSY;
goto err_put;
}
- if (!capable(CAP_SYS_ADMIN)) {
+ if (!capable(CAP_BPF)) {
err = -EPERM;
goto err_put;
}
@@ -1634,7 +1634,7 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) &&
(attr->prog_flags & BPF_F_ANY_ALIGNMENT) &&
- !capable(CAP_SYS_ADMIN))
+ !capable(CAP_BPF))
return -EPERM;
/* copy eBPF program license from user space */
@@ -1647,11 +1647,11 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
is_gpl = license_is_gpl_compatible(license);
if (attr->insn_cnt == 0 ||
- attr->insn_cnt > (capable(CAP_SYS_ADMIN) ? BPF_COMPLEXITY_LIMIT_INSNS : BPF_MAXINSNS))
+ attr->insn_cnt > (capable(CAP_BPF) ? BPF_COMPLEXITY_LIMIT_INSNS : BPF_MAXINSNS))
return -E2BIG;
if (type != BPF_PROG_TYPE_SOCKET_FILTER &&
type != BPF_PROG_TYPE_CGROUP_SKB &&
- !capable(CAP_SYS_ADMIN))
+ !capable(CAP_BPF))
return -EPERM;
bpf_prog_load_fixup_attach_type(attr);
@@ -1802,6 +1802,9 @@ static int bpf_raw_tracepoint_open(const union bpf_attr *attr)
char tp_name[128];
int tp_fd, err;
+ if (!cap_bpf_tracing())
+ return -EPERM;
+
if (strncpy_from_user(tp_name, u64_to_user_ptr(attr->raw_tracepoint.name),
sizeof(tp_name) - 1) < 0)
return -EFAULT;
@@ -2080,7 +2083,10 @@ static int bpf_prog_test_run(const union bpf_attr *attr,
struct bpf_prog *prog;
int ret = -ENOTSUPP;
- if (!capable(CAP_SYS_ADMIN))
+ if (!capable(CAP_NET_ADMIN) || !capable(CAP_BPF))
+ /* test_run callback is available for networking progs only.
+ * Add cap_bpf_tracing() above when tracing progs become runnable.
+ */
return -EPERM;
if (CHECK_ATTR(BPF_PROG_TEST_RUN))
return -EINVAL;
@@ -2117,7 +2123,7 @@ static int bpf_obj_get_next_id(const union bpf_attr *attr,
if (CHECK_ATTR(BPF_OBJ_GET_NEXT_ID) || next_id >= INT_MAX)
return -EINVAL;
- if (!capable(CAP_SYS_ADMIN))
+ if (!capable(CAP_BPF))
return -EPERM;
next_id++;
@@ -2143,7 +2149,7 @@ static int bpf_prog_get_fd_by_id(const union bpf_attr *attr)
if (CHECK_ATTR(BPF_PROG_GET_FD_BY_ID))
return -EINVAL;
- if (!capable(CAP_SYS_ADMIN))
+ if (!capable(CAP_BPF))
return -EPERM;
spin_lock_bh(&prog_idr_lock);
@@ -2177,7 +2183,7 @@ static int bpf_map_get_fd_by_id(const union bpf_attr *attr)
attr->open_flags & ~BPF_OBJ_FLAG_MASK)
return -EINVAL;
- if (!capable(CAP_SYS_ADMIN))
+ if (!capable(CAP_BPF))
return -EPERM;
f_flags = bpf_get_file_flag(attr->open_flags);
@@ -2352,7 +2358,7 @@ static int bpf_prog_get_info_by_fd(struct bpf_prog *prog,
info.run_time_ns = stats.nsecs;
info.run_cnt = stats.cnt;
- if (!capable(CAP_SYS_ADMIN)) {
+ if (!capable(CAP_BPF)) {
info.jited_prog_len = 0;
info.xlated_prog_len = 0;
info.nr_jited_ksyms = 0;
@@ -2670,7 +2676,7 @@ static int bpf_btf_load(const union bpf_attr *attr)
if (CHECK_ATTR(BPF_BTF_LOAD))
return -EINVAL;
- if (!capable(CAP_SYS_ADMIN))
+ if (!capable(CAP_BPF))
return -EPERM;
return btf_new_fd(attr);
@@ -2683,7 +2689,7 @@ static int bpf_btf_get_fd_by_id(const union bpf_attr *attr)
if (CHECK_ATTR(BPF_BTF_GET_FD_BY_ID))
return -EINVAL;
- if (!capable(CAP_SYS_ADMIN))
+ if (!capable(CAP_BPF))
return -EPERM;
return btf_get_fd_by_id(attr->btf_id);
@@ -2752,7 +2758,7 @@ static int bpf_task_fd_query(const union bpf_attr *attr,
if (CHECK_ATTR(BPF_TASK_FD_QUERY))
return -EINVAL;
- if (!capable(CAP_SYS_ADMIN))
+ if (!cap_bpf_tracing())
return -EPERM;
if (attr->task_fd_query.flags != 0)
@@ -2820,7 +2826,7 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
union bpf_attr attr = {};
int err;
- if (sysctl_unprivileged_bpf_disabled && !capable(CAP_SYS_ADMIN))
+ if (sysctl_unprivileged_bpf_disabled && !capable(CAP_BPF))
return -EPERM;
err = bpf_check_uarg_tail_zero(uattr, sizeof(attr), size);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 10c0ff93f52b..5810e8cc9342 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -987,7 +987,7 @@ static void __mark_reg_unbounded(struct bpf_reg_state *reg)
reg->umax_value = U64_MAX;
/* constant backtracking is enabled for root only for now */
- reg->precise = capable(CAP_SYS_ADMIN) ? false : true;
+ reg->precise = capable(CAP_BPF) ? false : true;
}
/* Mark a register as having a completely unknown (scalar) value. */
@@ -9233,7 +9233,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
env->insn_aux_data[i].orig_idx = i;
env->prog = *prog;
env->ops = bpf_verifier_ops[env->prog->type];
- is_priv = capable(CAP_SYS_ADMIN);
+ is_priv = capable(CAP_BPF);
/* grab the mutex to protect few globals used by verifier */
if (!is_priv)
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index ca1255d14576..2bf58ff5bf75 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1246,7 +1246,7 @@ int perf_event_query_prog_array(struct perf_event *event, void __user *info)
u32 *ids, prog_cnt, ids_len;
int ret;
- if (!capable(CAP_SYS_ADMIN))
+ if (!cap_bpf_tracing())
return -EPERM;
if (event->attr.type != PERF_TYPE_TRACEPOINT)
return -EINVAL;
diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index da5639a5bd3b..0b29f6abbeba 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -616,7 +616,7 @@ static int bpf_sk_storage_map_alloc_check(union bpf_attr *attr)
!attr->btf_key_type_id || !attr->btf_value_type_id)
return -EINVAL;
- if (!capable(CAP_SYS_ADMIN))
+ if (!capable(CAP_BPF))
return -EPERM;
if (attr->value_size >= KMALLOC_MAX_SIZE -
diff --git a/net/core/filter.c b/net/core/filter.c
index 0c1059cdad3d..986277abfde2 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5990,7 +5990,7 @@ bpf_base_func_proto(enum bpf_func_id func_id)
break;
}
- if (!capable(CAP_SYS_ADMIN))
+ if (!capable(CAP_BPF))
return NULL;
switch (func_id) {
@@ -5999,7 +5999,9 @@ bpf_base_func_proto(enum bpf_func_id func_id)
case BPF_FUNC_spin_unlock:
return &bpf_spin_unlock_proto;
case BPF_FUNC_trace_printk:
- return bpf_get_trace_printk_proto();
+ if (cap_bpf_tracing())
+ return bpf_get_trace_printk_proto();
+ /* fall through */
default:
return NULL;
}
@@ -6563,7 +6565,7 @@ static bool cg_skb_is_valid_access(int off, int size,
return false;
case bpf_ctx_range(struct __sk_buff, data):
case bpf_ctx_range(struct __sk_buff, data_end):
- if (!capable(CAP_SYS_ADMIN))
+ if (!capable(CAP_BPF))
return false;
break;
}
@@ -6575,7 +6577,7 @@ static bool cg_skb_is_valid_access(int off, int size,
case bpf_ctx_range_till(struct __sk_buff, cb[0], cb[4]):
break;
case bpf_ctx_range(struct __sk_buff, tstamp):
- if (!capable(CAP_SYS_ADMIN))
+ if (!capable(CAP_BPF))
return false;
break;
default:
diff --git a/security/selinux/include/classmap.h b/security/selinux/include/classmap.h
index 201f7e588a29..1c925bc04072 100644
--- a/security/selinux/include/classmap.h
+++ b/security/selinux/include/classmap.h
@@ -26,9 +26,9 @@
"audit_control", "setfcap"
#define COMMON_CAP2_PERMS "mac_override", "mac_admin", "syslog", \
- "wake_alarm", "block_suspend", "audit_read"
+ "wake_alarm", "block_suspend", "audit_read", "bpf"
-#if CAP_LAST_CAP > CAP_AUDIT_READ
+#if CAP_LAST_CAP > CAP_BPF
#error New capability defined, please update COMMON_CAP2_PERMS.
#endif
diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index 44e2d640b088..b31b961f1020 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -805,10 +805,18 @@ static void do_test_fixup(struct bpf_test *test, enum bpf_prog_type prog_type,
}
}
+struct libcap {
+ struct __user_cap_header_struct hdr;
+ struct __user_cap_data_struct data[2];
+};
+
static int set_admin(bool admin)
{
cap_t caps;
- const cap_value_t cap_val = CAP_SYS_ADMIN;
+ /* need CAP_BPF to load progs and CAP_NET_ADMIN to run networking progs */
+ const cap_value_t cap_val[] = {38/*CAP_BPF*/, CAP_NET_ADMIN};
+ const cap_value_t cap_val_admin = CAP_SYS_ADMIN;
+ struct libcap *cap;
int ret = -1;
caps = cap_get_proc();
@@ -816,11 +824,23 @@ static int set_admin(bool admin)
perror("cap_get_proc");
return -1;
}
- if (cap_set_flag(caps, CAP_EFFECTIVE, 1, &cap_val,
+ cap = (struct libcap *)caps;
+ if (cap_set_flag(caps, CAP_EFFECTIVE, 1, &cap_val_admin, CAP_CLEAR)) {
+ perror("cap_set_flag clear admin");
+ goto out;
+ }
+ if (cap_set_flag(caps, CAP_EFFECTIVE, 2, cap_val,
admin ? CAP_SET : CAP_CLEAR)) {
- perror("cap_set_flag");
+ perror("cap_set_flag set_or_clear bpf+net");
goto out;
}
+ /* libcap is likely old and simply ignores CAP_BPF,
+ * so update effective bits manually
+ */
+ if (admin)
+ cap->data[1].effective |= 1 << (38 - 32);
+ else
+ cap->data[1].effective &= ~(1 << (38 - 32));
if (cap_set_proc(caps)) {
perror("cap_set_proc");
goto out;
@@ -1013,8 +1033,9 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
static bool is_admin(void)
{
cap_t caps;
- cap_flag_value_t sysadmin = CAP_CLEAR;
- const cap_value_t cap_val = CAP_SYS_ADMIN;
+ cap_flag_value_t bpf_priv = CAP_CLEAR;
+ cap_flag_value_t net_priv = CAP_CLEAR;
+ struct libcap *cap;
#ifdef CAP_IS_SUPPORTED
if (!CAP_IS_SUPPORTED(CAP_SETFCAP)) {
@@ -1027,11 +1048,13 @@ static bool is_admin(void)
perror("cap_get_proc");
return false;
}
- if (cap_get_flag(caps, cap_val, CAP_EFFECTIVE, &sysadmin))
- perror("cap_get_flag");
+ cap = (struct libcap *)caps;
+ bpf_priv = !!(cap->data[1].effective & (1 << (38/* CAP_BPF */ - 32)));
+ if (cap_get_flag(caps, CAP_NET_ADMIN, CAP_EFFECTIVE, &net_priv))
+ perror("cap_get_flag NET");
if (cap_free(caps))
perror("cap_free");
- return (sysadmin == CAP_SET);
+ return bpf_priv == CAP_SET && net_priv == CAP_SET;
}
static void get_unpriv_disabled()
--
2.20.0
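The selftest change above manipulates the CAP_BPF bit by hand because
capability number 38 lives in the second 32-bit word of the capability data.
A minimal sketch of that arithmetic follows. The struct and the model_* names
are hypothetical stand-ins, not the libcap or kernel types. Note that a raw
masked value for bit 6 is 0x40, not 1, so it has to be normalised to a boolean
before being compared with a 0/1 flag value such as CAP_SET.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_CAP_BPF 38	/* value proposed by this patch */

/* Hypothetical stand-in for the two 32-bit words of effective capabilities. */
struct model_cap_data {
	uint32_t effective[2];
};

/* Set or clear one capability bit; cap / 32 picks the word, cap % 32 the bit. */
static void model_cap_set(struct model_cap_data *d, unsigned int cap, bool on)
{
	assert(cap < 64);
	uint32_t bit = UINT32_C(1) << (cap % 32);

	if (on)
		d->effective[cap / 32] |= bit;
	else
		d->effective[cap / 32] &= ~bit;
}

/* Return the bit normalised to true/false rather than the raw masked value. */
static bool model_cap_test(const struct model_cap_data *d, unsigned int cap)
{
	assert(cap < 64);
	return (d->effective[cap / 32] >> (cap % 32)) & 1;
}
```

For capability 38 this selects word 1 and bit 6, which matches the
`1 << (38 - 32)` expression used in the test.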
* Re: [PATCH net] tcp: inherit timestamp on mtu probe
From: Willem de Bruijn @ 2019-08-27 20:53 UTC
To: Eric Dumazet; +Cc: Willem de Bruijn, netdev, David Miller, Jakub Kicinski
In-Reply-To: <CANn89iKwaar9fmgfoDTKebfRGHjR2K3gLeeJCr-bvturzgj3zQ@mail.gmail.com>
On Tue, Aug 27, 2019 at 4:07 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Aug 27, 2019 at 9:09 PM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > From: Willem de Bruijn <willemb@google.com>
> >
> > TCP associates tx timestamp requests with a byte in the bytestream.
> > If merging skbs in tcp_mtu_probe, migrate the tstamp request.
> >
> > Similar to MSG_EOR, do not allow moving a timestamp from any segment
> > in the probe but the last. This to avoid merging multiple timestamps.
> >
> > Tested with the packetdrill script at
> > https://github.com/wdebruij/packetdrill/commits/mtu_probe-1
> >
> > Link: http://patchwork.ozlabs.org/patch/1143278/#2232897
> > Fixes: 4ed2d765dfac ("net-timestamp: TCP timestamping")
> > Signed-off-by: Willem de Bruijn <willemb@google.com>
> > ---
> > net/ipv4/tcp_output.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> > index 5c46bc4c7e8d..42abc9bd687a 100644
> > --- a/net/ipv4/tcp_output.c
> > +++ b/net/ipv4/tcp_output.c
> > @@ -2053,7 +2053,7 @@ static bool tcp_can_coalesce_send_queue_head(struct sock *sk, int len)
> > if (len <= skb->len)
> > break;
> >
> > - if (unlikely(TCP_SKB_CB(skb)->eor))
> > + if (unlikely(TCP_SKB_CB(skb)->eor) || tcp_has_tx_tstamp(skb))
> > return false;
> >
> > len -= skb->len;
> > @@ -2170,6 +2170,7 @@ static int tcp_mtu_probe(struct sock *sk)
> > * we need to propagate it to the new skb.
> > */
> > TCP_SKB_CB(nskb)->eor = TCP_SKB_CB(skb)->eor;
> > + tcp_skb_collapse_tstamp(nskb, skb);
>
> nit: maybe rename tcp_skb_collapse_tstamp() to tcp_skb_tstamp_copy()
> or something ?
>
> Its name came from the fact that it was only used from
> tcp_collapse_retrans(), but it will no
> longer be the case after your fix.
Sure, that's more descriptive.
One caveat, the function is exposed in a header, so it's a
bit more churn. If you don't mind that, I'll send the v2.
* RE: [PATCH 3/4] net: dsa: microchip: fix interrupt mask
From: Tristram.Ha @ 2019-08-27 20:56 UTC
To: Razvan.Stefanescu, andrew
Cc: Woojung.Huh, UNGLinuxDriver, vivien.didelot, f.fainelli, davem,
netdev, linux-kernel
In-Reply-To: <f54e1c98-e2db-2c63-4bd9-d1576f94937b@microchip.com>
> On 27/08/2019 15:51, Andrew Lunn wrote:
> >
> > On Tue, Aug 27, 2019 at 12:31:09PM +0300, Razvan Stefanescu wrote:
> >> Global Interrupt Mask Register consists of Lookup Engine (LUE) Interrupt
> >> Mask (bit 31) and GPIO Pin Output Trigger and Timestamp Unit Interrupt
> >> Mask (bit 29).
> >>
> >> This corrects LUE bit.
> >
> > Hi Razvan
> >
> > Is this a fix? Something that should be back ported to old kernels?
>
> Hello,
>
> During testing I did not observe any issues with the old value. So I am
> not sure how the switch is affected by the incorrect setting.
>
> Maybe maintainers will be able to make a better assessment if this needs
> back-porting. And I will be happy to do it if it is necessary.
>
I do not think the change has any effect, as interrupt handling is not implemented in the driver, unless I am mistaken and do not know about new code.
Currently those 3 interrupts do not do anything that is required in normal operation.
The first one, LUE_INT, notifies the driver when there are learn/write failures in the MAC table. This condition rarely happens unless the switch is going through a stress test. When this interrupt happens, software cannot do anything to resolve the issue. It may become a denial of service if the MAC table keeps triggering learn failures.
The second one is used by PTP code, which is not implemented.
The third one is triggered when an access targets register space that does not exist. It is useful during development so the driver knows it is accessing the wrong register. It can also become a denial of service if someone keeps accessing wrong registers, but then that person can do anything with the chip.
* Re: [PATCH net] tcp: inherit timestamp on mtu probe
From: Eric Dumazet @ 2019-08-27 20:58 UTC
To: Willem de Bruijn; +Cc: netdev, David Miller, Jakub Kicinski
In-Reply-To: <CA+FuTSfK=xSMJvVNJB7DKdqwG_FAi2gLjbCvkXVqF99n71rRdg@mail.gmail.com>
On Tue, Aug 27, 2019 at 10:54 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
> Sure, that's more descriptive.
>
> One caveat, the function is exposed in a header, so it's a
> bit more churn. If you don't mind that, I'll send the v2.
Oh right, it is also used from tcp_shifted_skb() after Martin KaFai Lau's fix ...
* RE: [PATCH 4/4] net: dsa: microchip: avoid hard-codded port count
From: Tristram.Ha @ 2019-08-27 21:03 UTC
To: Razvan.Stefanescu
Cc: netdev, linux-kernel, Razvan.Stefanescu, Woojung.Huh,
vivien.didelot, UNGLinuxDriver, andrew, f.fainelli, davem
In-Reply-To: <20190827093110.14957-5-razvan.stefanescu@microchip.com>
> Subject: [PATCH 4/4] net: dsa: microchip: avoid hard-codded port count
>
> Use port_cnt value to disable interrupts on switch reset.
>
> Signed-off-by: Razvan Stefanescu <razvan.stefanescu@microchip.com>
> ---
> drivers/net/dsa/microchip/ksz9477.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/dsa/microchip/ksz9477.c
> b/drivers/net/dsa/microchip/ksz9477.c
> index 187be42de5f1..54fc05595d48 100644
> --- a/drivers/net/dsa/microchip/ksz9477.c
> +++ b/drivers/net/dsa/microchip/ksz9477.c
> @@ -213,7 +213,7 @@ static int ksz9477_reset_switch(struct ksz_device
> *dev)
>
> /* disable interrupts */
> ksz_write32(dev, REG_SW_INT_MASK__4, SWITCH_INT_MASK);
> - ksz_write32(dev, REG_SW_PORT_INT_MASK__4, 0x7F);
> + ksz_write32(dev, REG_SW_PORT_INT_MASK__4, dev->port_cnt);
> ksz_read32(dev, REG_SW_PORT_INT_STATUS__4, &data32);
>
> /* set broadcast storm protection 10% rate */
The register value is a portmap, so using port_cnt may be wrong.
The chip is a 7-port switch. There is a 6-port variant, but it is okay to write 0x7F.
There is also a 3-port variant which uses a different design. It is a bit of a stretch to use 0x7F on it.
It is more a matter of code readability or correctness than of incorrect hardware operation.
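To illustrate the difference between a port count and a portmap: writing the
count 7 to a bitmap register sets bits 0-2 only (0x07), whereas masking all
seven ports needs 0x7F. A minimal sketch of deriving the mask from the count
is below. The helper name is hypothetical and is not part of the ksz driver,
and whether the driver's port_cnt covers every port that needs masking is an
assumption that would have to be checked.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: build a bitmap with one bit set for each of the
 * first port_cnt ports (bit 0 = port 0).
 */
static uint32_t model_port_mask(unsigned int port_cnt)
{
	assert(port_cnt <= 32);
	if (port_cnt == 32)
		return UINT32_MAX;	/* avoid shifting by the full width */
	return (UINT32_C(1) << port_cnt) - 1;
}
```

With this helper a 7-port device gets 0x7F, a 6-port device 0x3F and a
3-port device 0x07.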
* Re: Unable to create htb tc classes more than 64K
From: Eric Dumazet @ 2019-08-27 21:09 UTC
To: Dave Taht, Eric Dumazet
Cc: Cong Wang, Akshat Kakkar, Anton Danilov, NetFilter, lartc, netdev
In-Reply-To: <CAA93jw6TWUmqsvBDT4tFPgwjGxAmm_S5bUibj16nwp1F=AwyRA@mail.gmail.com>
On 8/27/19 10:53 PM, Dave Taht wrote:
>
> Although this is very cool, I think in this case the OP is being
> a router, not server?
This mechanism is generic. EDT was not designed for servers only.
One HTB class (with one associated qdisc per leaf) per rate limiter
does not scale, and consumes a _lot_ more memory.
We have abandoned HTB at Google for these reasons.
The nice thing with EDT is that you can stack an arbitrary number of rate limiters
and still keep a single queue (in FQ or another layer downstream).
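A rough sketch of why stacking is cheap with EDT (Earliest Departure Time):
each rate limiter only tracks the earliest time its next packet may leave,
and a packet subject to several limiters is stamped with the latest of those
times. A single downstream queue then releases packets by timestamp. The code
below is an illustrative userspace model under those assumptions, with
hypothetical names. It is not how FQ or any in-kernel EDT user is
implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NSEC_PER_SEC UINT64_C(1000000000)

/* Hypothetical limiter state: a rate and the earliest next departure time. */
struct model_limiter {
	uint64_t bytes_per_sec;
	uint64_t next_ns;
};

/*
 * Stamp one packet against every limiter in lims[]: the departure time is
 * the latest "next" time among them (never earlier than now), and each
 * limiter is then charged for the packet from that departure time.
 */
static uint64_t model_edt_stamp(struct model_limiter *lims, size_t n,
				uint64_t now_ns, uint32_t pkt_len)
{
	uint64_t depart = now_ns;
	size_t i;

	for (i = 0; i < n; i++)
		if (lims[i].next_ns > depart)
			depart = lims[i].next_ns;

	for (i = 0; i < n; i++) {
		assert(lims[i].bytes_per_sec > 0);
		lims[i].next_ns = depart +
			(uint64_t)pkt_len * MODEL_NSEC_PER_SEC /
			lims[i].bytes_per_sec;
	}
	return depart;
}
```

The per-limiter state here is two integers, and adding another limiter adds
one more comparison per packet rather than another queue.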
* Re: [PATCH -next] net: mlx5: Kconfig: Fix MLX5_CORE_EN dependencies
From: Saeed Mahameed @ 2019-08-27 21:15 UTC
To: wharms@bfs.de, maowenan@huawei.com
Cc: linux-rdma@vger.kernel.org, kernel-janitors@vger.kernel.org,
davem@davemloft.net, netdev@vger.kernel.org, leon@kernel.org,
linux-kernel@vger.kernel.org
In-Reply-To: <092f56bf-3e94-a3b6-926c-da33ba26ee37@huawei.com>
On Tue, 2019-08-27 at 17:51 +0800, maowenan wrote:
>
> On 2019/8/27 15:24, walter harms wrote:
> >
> > Am 27.08.2019 05:12, schrieb Mao Wenan:
> > > When MLX5_CORE_EN=y and PCI_HYPERV_INTERFACE is not set, below
> > > errors are found:
> > > drivers/net/ethernet/mellanox/mlx5/core/en_main.o: In function
> > > `mlx5e_nic_enable':
> > > en_main.c:(.text+0xb649): undefined reference to
> > > `mlx5e_hv_vhca_stats_create'
> > > drivers/net/ethernet/mellanox/mlx5/core/en_main.o: In function
> > > `mlx5e_nic_disable':
> > > en_main.c:(.text+0xb8c4): undefined reference to
> > > `mlx5e_hv_vhca_stats_destroy'
> > >
> > > This because CONFIG_PCI_HYPERV_INTERFACE is newly introduced by
> > > 'commit 348dd93e40c1
> > > ("PCI: hv: Add a Hyper-V PCI interface driver for software
> > > backchannel interface"),
> > > Fix this by making MLX5_CORE_EN imply PCI_HYPERV_INTERFACE.
> > >
> > > Fixes: cef35af34d6d ("net/mlx5e: Add mlx5e HV VHCA stats agent")
> > > Signed-off-by: Mao Wenan <maowenan@huawei.com>
> > > ---
> > > drivers/net/ethernet/mellanox/mlx5/core/Kconfig | 1 +
> > > 1 file changed, 1 insertion(+)
> > >
> > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
> > > b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
> > > index 37fef8c..a6a70ce 100644
> > > --- a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
> > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
> > > @@ -35,6 +35,7 @@ config MLX5_CORE_EN
> > > depends on IPV6=y || IPV6=n || MLX5_CORE=m
> >
> > OT but ...
> > is that IPV6 needed at all ? can there be something else that yes
> > or no ?
It is only needed for en_rep.c/en_tc.c, which are only compiled when
MLX5_ESWITCH is selected, so this condition should actually be on
MLX5_ESWITCH and not MLX5_CORE_EN.
I tested with:
MLX5_CORE=y
MLX5_CORE_EN=y
IPV6=m
and removed the dependency.
So if ipv6 is a module but mlx5 is built in, this happens:
ld: drivers/net/ethernet/mellanox/mlx5/core/en_rep.o: in function `mlx5e_rep_neigh_update_init_interval':
/home/saeedm/devel/linux/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c:505: undefined reference to `nd_tbl'
ld: drivers/net/ethernet/mellanox/mlx5/core/en_rep.o: in function `mlx5e_rep_netevent_event':
/home/saeedm/devel/linux/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c:946: undefined reference to `nd_tbl'
ld: /home/saeedm/devel/linux/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c:919: undefined reference to `nd_tbl'
ld: drivers/net/ethernet/mellanox/mlx5/core/en_tc.o: in function `mlx5e_tc_update_neigh_used_value':
/home/saeedm/devel/linux/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:1497: undefined reference to `nd_tbl'
The problem is that mlx5_core can't be built in if ipv6 is a module, due
to this nd_tbl dependency.
I think this is solvable by using ipv6_stub->nd_tbl instead of
referencing nd_tbl directly from mlx5.
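The stub idea can be sketched in userspace terms: built-in code never refers
to the module's symbol directly, only to a pointer table that the module
fills in when it loads, so there is no link-time dependency. The types and
model_* names below are hypothetical and only illustrate the indirection.
They are not the kernel's ipv6_stub definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a neighbour table owned by the module. */
struct model_neigh_table {
	int entries;
};

/* Pointer table that always exists in the built-in image. */
struct model_ipv6_stub {
	struct model_neigh_table *nd_tbl;
};

static struct model_ipv6_stub model_stub = { NULL };

/* What the module would do when it is loaded. */
static void model_ipv6_module_load(struct model_neigh_table *tbl)
{
	assert(tbl != NULL);
	model_stub.nd_tbl = tbl;
}

/* What the module would do when it is unloaded. */
static void model_ipv6_module_unload(void)
{
	model_stub.nd_tbl = NULL;
}

/* Built-in caller: goes through the stub and copes with a missing module. */
static int model_neigh_entries(void)
{
	const struct model_neigh_table *tbl = model_stub.nd_tbl;

	return tbl ? tbl->entries : -1;
}
```

The built-in caller links regardless of whether the module is present, and
only has to handle the case where the pointer has not been set.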
>
> If I set IPV6=m, errors are found as below:
> drivers/net/ethernet/mellanox/mlx5/core/main.o: In function
> `mlx5_unload':
> main.c:(.text+0x275): undefined reference to `mlx5_hv_vhca_cleanup'
> drivers/net/ethernet/mellanox/mlx5/core/main.o: In function
> `mlx5_cleanup_once':
> main.c:(.text+0x2e8): undefined reference to `mlx5_hv_vhca_destroy'
> drivers/net/ethernet/mellanox/mlx5/core/main.o: In function
> `mlx5_load_one':
> main.c:(.text+0x23c1): undefined reference to `mlx5_hv_vhca_create'
> main.c:(.text+0x248f): undefined reference to `mlx5_hv_vhca_init'
> main.c:(.text+0x25e0): undefined reference to `mlx5_hv_vhca_cleanup'
This is not related; I think there is something wrong with your local
repository.
* Re: [PATCH net] tcp: inherit timestamp on mtu probe
From: Willem de Bruijn @ 2019-08-27 21:15 UTC
To: Eric Dumazet; +Cc: netdev, David Miller, Jakub Kicinski
In-Reply-To: <CANn89i++59nk_RFMOgor6XL3ZZY7t9QLa70sppKe6eQBrObagQ@mail.gmail.com>
On Tue, Aug 27, 2019 at 4:58 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Aug 27, 2019 at 10:54 PM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
>
> > Sure, that's more descriptive.
> >
> > One caveat, the function is exposed in a header, so it's a
> > bit more churn. If you don't mind that, I'll send the v2.
>
> Oh right it is also used from tcp_shifted_skb() after Martin KaFai Lau fix ...
Leave as is then?
* [PATCH net] net/sched: pfifo_fast: fix wrong dereference in pfifo_fast_enqueue
From: Davide Caratti @ 2019-08-27 21:18 UTC
To: Cong Wang, Jamal Hadi Salim, Jiri Pirko, David S. Miller, netdev
Cc: Paolo Abeni, Stefano Brivio, Li Shuang
Now that 'TCQ_F_CPUSTATS' bit can be cleared, depending on the value of
'TCQ_F_NOLOCK' bit in the parent qdisc, we can't assume anymore that
per-cpu counters are there in the error path of skb_array_produce().
Otherwise, the following splat can be seen:
Unable to handle kernel paging request at virtual address 0000600dea430008
Mem abort info:
ESR = 0x96000005
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000005
CM = 0, WnR = 0
user pgtable: 64k pages, 48-bit VAs, pgdp = 000000007b97530e
[0000600dea430008] pgd=0000000000000000, pud=0000000000000000
Internal error: Oops: 96000005 [#1] SMP
[...]
pstate: 10000005 (nzcV daif -PAN -UAO)
pc : pfifo_fast_enqueue+0x524/0x6e8
lr : pfifo_fast_enqueue+0x46c/0x6e8
sp : ffff800d39376fe0
x29: ffff800d39376fe0 x28: 1ffff001a07d1e40
x27: ffff800d03e8f188 x26: ffff800d03e8f200
x25: 0000000000000062 x24: ffff800d393772f0
x23: 0000000000000000 x22: 0000000000000403
x21: ffff800cca569a00 x20: ffff800d03e8ee00
x19: ffff800cca569a10 x18: 00000000000000bf
x17: 0000000000000000 x16: 0000000000000000
x15: 0000000000000000 x14: ffff1001a726edd0
x13: 1fffe4000276a9a4 x12: 0000000000000000
x11: dfff200000000000 x10: ffff800d03e8f1a0
x9 : 0000000000000003 x8 : 0000000000000000
x7 : 00000000f1f1f1f1 x6 : ffff1001a726edea
x5 : ffff800cca56a53c x4 : 1ffff001bf9a8003
x3 : 1ffff001bf9a8003 x2 : 1ffff001a07d1dcb
x1 : 0000600dea430000 x0 : 0000600dea430008
Process ping (pid: 6067, stack limit = 0x00000000dc0aa557)
Call trace:
pfifo_fast_enqueue+0x524/0x6e8
htb_enqueue+0x660/0x10e0 [sch_htb]
__dev_queue_xmit+0x123c/0x2de0
dev_queue_xmit+0x24/0x30
ip_finish_output2+0xc48/0x1720
ip_finish_output+0x548/0x9d8
ip_output+0x334/0x788
ip_local_out+0x90/0x138
ip_send_skb+0x44/0x1d0
ip_push_pending_frames+0x5c/0x78
raw_sendmsg+0xed8/0x28d0
inet_sendmsg+0xc4/0x5c0
sock_sendmsg+0xac/0x108
__sys_sendto+0x1ac/0x2a0
__arm64_sys_sendto+0xc4/0x138
el0_svc_handler+0x13c/0x298
el0_svc+0x8/0xc
Code: f9402e80 d538d081 91002000 8b010000 (885f7c03)
Fix this by testing the value of 'TCQ_F_CPUSTATS' bit in 'qdisc->flags',
before dereferencing 'qdisc->cpu_qstats'.
Fixes: 8a53e616de29 ("net: sched: when clearing NOLOCK, clear TCQ_F_CPUSTATS, too")
CC: Paolo Abeni <pabeni@redhat.com>
CC: Stefano Brivio <sbrivio@redhat.com>
Reported-by: Li Shuang <shuali@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
net/sched/sch_generic.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 099797e5409d..137db1cbde85 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -624,8 +624,12 @@ static int pfifo_fast_enqueue(struct sk_buff *skb, struct Qdisc *qdisc,
err = skb_array_produce(q, skb);
- if (unlikely(err))
- return qdisc_drop_cpu(skb, qdisc, to_free);
+ if (unlikely(err)) {
+ if (qdisc_is_percpu_stats(qdisc))
+ return qdisc_drop_cpu(skb, qdisc, to_free);
+ else
+ return qdisc_drop(skb, qdisc, to_free);
+ }
qdisc_update_stats_at_enqueue(qdisc, pkt_len);
return NET_XMIT_SUCCESS;
--
2.20.1
^ permalink raw reply related
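Note on the patch above: the fix amounts to checking a flag before following a pointer that is only valid while that flag is set. A minimal userspace sketch of that pattern follows. All `sketch_*` names and the flag value are hypothetical stand-ins chosen for illustration; they are not the kernel's struct Qdisc, TCQ_F_CPUSTATS, or qdisc_drop_cpu().

```c
#include <stddef.h>

/* Hypothetical flag and types; they only mimic the shape of the
 * kernel structures discussed in the patch. */
#define SKETCH_F_CPUSTATS 0x20u

struct sketch_qstats {
	unsigned int drops;
};

struct sketch_qdisc {
	unsigned int flags;
	struct sketch_qstats *cpu_qstats; /* valid only with SKETCH_F_CPUSTATS */
	struct sketch_qstats qstats;      /* always valid */
};

/* Count a drop in whichever counter the flags say is in use. */
static void sketch_count_drop(struct sketch_qdisc *q)
{
	if (q->flags & SKETCH_F_CPUSTATS)
		q->cpu_qstats->drops++;
	else
		q->qstats.drops++;
}
```

With the flag clear, the per-CPU pointer is never touched, which is the property the error path was missing.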
* Re: [PATCH v2 1/1] netfilter: nf_tables: fib: Drop IPV6 packages if IPv6 is disabled on boot
From: David Miller @ 2019-08-27 21:19 UTC (permalink / raw)
To: leonardo
Cc: pablo, netfilter-devel, coreteam, netdev, linux-kernel, kadlec,
fw, kuznet, yoshfuji
In-Reply-To: <77c43754ff72e9a2e8048ccd032351cf0186080a.camel@linux.ibm.com>
From: Leonardo Bras <leonardo@linux.ibm.com>
Date: Tue, 27 Aug 2019 14:34:14 -0300
> I could reproduce this bug on a host ('ipv6.disable=1') starting a
> guest with a virtio-net interface with 'filterref' over a virtual
> bridge. It crashes the host during guest boot (just before login).
>
> From that I understand that guest IPv6 network traffic
> (via virtio-net) may cause this kernel panic.
Really this is bad and I suspected bridging to be involved somehow.
If ipv6 is disabled ipv6 traffic should not pass through the machine
by any means whatsoever. Otherwise there is no point to the knob
and we will keep having to add hack checks all over the tree instead
of fixing the fundamental issue.
^ permalink raw reply
* Re: [PATCH net] ipv6: Default fib6_type to RTN_UNICAST when not set
From: David Miller @ 2019-08-27 21:21 UTC (permalink / raw)
To: dsahern; +Cc: Joakim.Tjernlund, netdev, greg, stable
In-Reply-To: <b644d367-53a3-c2cc-2a84-28a7caae480c@gmail.com>
From: David Ahern <dsahern@gmail.com>
Date: Tue, 27 Aug 2019 11:51:32 -0600
> Specific request is for commit c7036d97acd2527cef145b5ef9ad1a37ed21bbe6
> ("ipv6: Default fib6_type to RTN_UNICAST when not set") to be queued for
> stable releases prior to v5.2
Ok, I'll take care of this in my next round.
^ permalink raw reply
* Re: [PATCH v5 net-next 02/18] ionic: Add hardware init and device commands
From: Shannon Nelson @ 2019-08-27 21:22 UTC (permalink / raw)
To: Jakub Kicinski; +Cc: netdev, davem
In-Reply-To: <20190826212404.77348857@cakuba.netronome.com>
On 8/26/19 9:24 PM, Jakub Kicinski wrote:
> On Mon, 26 Aug 2019 14:33:23 -0700, Shannon Nelson wrote:
>> The ionic device has a small set of PCI registers, including a
>> device control and data space, and a large set of message
>> commands.
>>
>> Signed-off-by: Shannon Nelson <snelson@pensando.io>
>> diff --git a/drivers/net/ethernet/pensando/ionic/Makefile b/drivers/net/ethernet/pensando/ionic/Makefile
>> index f174e8f7bce1..a23d58519c63 100644
>> --- a/drivers/net/ethernet/pensando/ionic/Makefile
>> +++ b/drivers/net/ethernet/pensando/ionic/Makefile
>> @@ -3,4 +3,5 @@
>>
>> obj-$(CONFIG_IONIC) := ionic.o
>>
>> -ionic-y := ionic_main.o ionic_bus_pci.o ionic_devlink.o
>> +ionic-y := ionic_main.o ionic_bus_pci.o ionic_devlink.o ionic_dev.o \
>> + ionic_debugfs.o
>> diff --git a/drivers/net/ethernet/pensando/ionic/ionic.h b/drivers/net/ethernet/pensando/ionic/ionic.h
>> index d40077161214..1f3c4a916849 100644
>> --- a/drivers/net/ethernet/pensando/ionic/ionic.h
>> +++ b/drivers/net/ethernet/pensando/ionic/ionic.h
>> @@ -4,6 +4,10 @@
>> #ifndef _IONIC_H_
>> #define _IONIC_H_
>>
>> +#include "ionic_if.h"
>> +#include "ionic_dev.h"
>> +#include "ionic_devlink.h"
>> +
>> #define IONIC_DRV_NAME "ionic"
>> #define IONIC_DRV_DESCRIPTION "Pensando Ethernet NIC Driver"
>> #define IONIC_DRV_VERSION "0.15.0-k"
>> @@ -17,10 +21,27 @@
>> #define IONIC_SUBDEV_ID_NAPLES_100_4 0x4001
>> #define IONIC_SUBDEV_ID_NAPLES_100_8 0x4002
>>
>> +#define devcmd_timeout 10
> nit: upper case?
Sure
>
>> struct ionic {
>> struct pci_dev *pdev;
>> struct device *dev;
>> struct devlink *dl;
>> + struct devlink_port dl_port;
> devlink_port is not used in this patch
I can move that to patch 09 where it gets used
>
>> + struct ionic_dev idev;
>> + struct mutex dev_cmd_lock; /* lock for dev_cmd operations */
>> + struct dentry *dentry;
>> + struct ionic_dev_bar bars[IONIC_BARS_MAX];
>> + unsigned int num_bars;
>> + struct ionic_identity ident;
>> };
>> + err = ionic_init(ionic);
>> + if (err) {
>> + dev_err(dev, "Cannot init device: %d, aborting\n", err);
>> + goto err_out_teardown;
>> + }
>> +
>> + err = ionic_devlink_register(ionic);
>> + if (err)
>> + dev_err(dev, "Cannot register devlink: %d\n", err);
>>
>> return 0;
>> +
>> +err_out_teardown:
>> + ionic_dev_teardown(ionic);
>> +err_out_unmap_bars:
>> + ionic_unmap_bars(ionic);
>> + pci_release_regions(pdev);
>> +err_out_pci_clear_master:
>> + pci_clear_master(pdev);
>> +err_out_pci_disable_device:
>> + pci_disable_device(pdev);
>> +err_out_debugfs_del_dev:
>> + ionic_debugfs_del_dev(ionic);
>> +err_out_clear_drvdata:
>> + mutex_destroy(&ionic->dev_cmd_lock);
>> + ionic_devlink_free(ionic);
>> +
>> + return err;
>> }
>>
>> static void ionic_remove(struct pci_dev *pdev)
>> {
>> struct ionic *ionic = pci_get_drvdata(pdev);
>>
>> + if (!ionic)
> How can this be NULL? Usually if this is NULL that means probe()
> failed but 'err' was not set properly. Perhaps WARN_ON() here?
Yes, pretty unlikely, but seems worth checking.
>
>> + return;
>> +
>> + ionic_devlink_unregister(ionic);
>> + ionic_reset(ionic);
>> + ionic_dev_teardown(ionic);
>> + ionic_unmap_bars(ionic);
>> + pci_release_regions(pdev);
>> + pci_clear_master(pdev);
>> + pci_disable_device(pdev);
>> + ionic_debugfs_del_dev(ionic);
>> + mutex_destroy(&ionic->dev_cmd_lock);
>> ionic_devlink_free(ionic);
>> }
>>
>> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_dev.h b/drivers/net/ethernet/pensando/ionic/ionic_dev.h
>> new file mode 100644
>> index 000000000000..30a5206bba4e
>> --- /dev/null
>> +++ b/drivers/net/ethernet/pensando/ionic/ionic_dev.h
>> @@ -0,0 +1,143 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/* Copyright(c) 2017 - 2019 Pensando Systems, Inc */
>> +
>> +#ifndef _IONIC_DEV_H_
>> +#define _IONIC_DEV_H_
>> +
>> +#include <linux/mutex.h>
>> +#include <linux/workqueue.h>
>> +
>> +#include "ionic_if.h"
>> +#include "ionic_regs.h"
>> +
>> +struct ionic_dev_bar {
>> + void __iomem *vaddr;
>> + phys_addr_t bus_addr;
>> + unsigned long len;
>> + int res_index;
>> +};
>> +
>> +static inline void ionic_struct_size_checks(void)
>> +{
>> + /* Registers */
>> + BUILD_BUG_ON(sizeof(struct ionic_intr) != 32);
>> +
>> + BUILD_BUG_ON(sizeof(struct ionic_doorbell) != 8);
>> + BUILD_BUG_ON(sizeof(struct ionic_intr_status) != 8);
>> +
>> + BUILD_BUG_ON(sizeof(union ionic_dev_regs) != 4096);
>> + BUILD_BUG_ON(sizeof(union ionic_dev_info_regs) != 2048);
>> + BUILD_BUG_ON(sizeof(union ionic_dev_cmd_regs) != 2048);
>> +
>> + BUILD_BUG_ON(sizeof(struct ionic_lif_stats) != 1024);
>> +
>> + BUILD_BUG_ON(sizeof(struct ionic_admin_cmd) != 64);
>> + BUILD_BUG_ON(sizeof(struct ionic_admin_comp) != 16);
>> + BUILD_BUG_ON(sizeof(struct ionic_nop_cmd) != 64);
>> + BUILD_BUG_ON(sizeof(struct ionic_nop_comp) != 16);
>> +
>> + /* Device commands */
>> + BUILD_BUG_ON(sizeof(struct ionic_dev_identify_cmd) != 64);
>> + BUILD_BUG_ON(sizeof(struct ionic_dev_identify_comp) != 16);
>> + BUILD_BUG_ON(sizeof(struct ionic_dev_init_cmd) != 64);
>> + BUILD_BUG_ON(sizeof(struct ionic_dev_init_comp) != 16);
>> + BUILD_BUG_ON(sizeof(struct ionic_dev_reset_cmd) != 64);
>> + BUILD_BUG_ON(sizeof(struct ionic_dev_reset_comp) != 16);
>> + BUILD_BUG_ON(sizeof(struct ionic_dev_getattr_cmd) != 64);
>> + BUILD_BUG_ON(sizeof(struct ionic_dev_getattr_comp) != 16);
>> + BUILD_BUG_ON(sizeof(struct ionic_dev_setattr_cmd) != 64);
>> + BUILD_BUG_ON(sizeof(struct ionic_dev_setattr_comp) != 16);
>> +
>> + /* Port commands */
>> + BUILD_BUG_ON(sizeof(struct ionic_port_identify_cmd) != 64);
>> + BUILD_BUG_ON(sizeof(struct ionic_port_identify_comp) != 16);
>> + BUILD_BUG_ON(sizeof(struct ionic_port_init_cmd) != 64);
>> + BUILD_BUG_ON(sizeof(struct ionic_port_init_comp) != 16);
>> + BUILD_BUG_ON(sizeof(struct ionic_port_reset_cmd) != 64);
>> + BUILD_BUG_ON(sizeof(struct ionic_port_reset_comp) != 16);
>> + BUILD_BUG_ON(sizeof(struct ionic_port_getattr_cmd) != 64);
>> + BUILD_BUG_ON(sizeof(struct ionic_port_getattr_comp) != 16);
>> + BUILD_BUG_ON(sizeof(struct ionic_port_setattr_cmd) != 64);
>> + BUILD_BUG_ON(sizeof(struct ionic_port_setattr_comp) != 16);
>> +
>> + /* LIF commands */
>> + BUILD_BUG_ON(sizeof(struct ionic_lif_init_cmd) != 64);
>> + BUILD_BUG_ON(sizeof(struct ionic_lif_init_comp) != 16);
>> + BUILD_BUG_ON(sizeof(struct ionic_lif_reset_cmd) != 64);
>> + BUILD_BUG_ON(sizeof(ionic_lif_reset_comp) != 16);
>> + BUILD_BUG_ON(sizeof(struct ionic_lif_getattr_cmd) != 64);
>> + BUILD_BUG_ON(sizeof(struct ionic_lif_getattr_comp) != 16);
>> + BUILD_BUG_ON(sizeof(struct ionic_lif_setattr_cmd) != 64);
>> + BUILD_BUG_ON(sizeof(struct ionic_lif_setattr_comp) != 16);
>> +
>> + BUILD_BUG_ON(sizeof(struct ionic_q_init_cmd) != 64);
>> + BUILD_BUG_ON(sizeof(struct ionic_q_init_comp) != 16);
>> + BUILD_BUG_ON(sizeof(struct ionic_q_control_cmd) != 64);
>> + BUILD_BUG_ON(sizeof(ionic_q_control_comp) != 16);
>> +
>> + BUILD_BUG_ON(sizeof(struct ionic_rx_mode_set_cmd) != 64);
>> + BUILD_BUG_ON(sizeof(ionic_rx_mode_set_comp) != 16);
>> + BUILD_BUG_ON(sizeof(struct ionic_rx_filter_add_cmd) != 64);
>> + BUILD_BUG_ON(sizeof(struct ionic_rx_filter_add_comp) != 16);
>> + BUILD_BUG_ON(sizeof(struct ionic_rx_filter_del_cmd) != 64);
>> + BUILD_BUG_ON(sizeof(ionic_rx_filter_del_comp) != 16);
>> +
>> + /* RDMA commands */
>> + BUILD_BUG_ON(sizeof(struct ionic_rdma_reset_cmd) != 64);
>> + BUILD_BUG_ON(sizeof(struct ionic_rdma_queue_cmd) != 64);
>> +
>> + /* Events */
>> + BUILD_BUG_ON(sizeof(struct ionic_notifyq_cmd) != 4);
>> + BUILD_BUG_ON(sizeof(union ionic_notifyq_comp) != 64);
>> + BUILD_BUG_ON(sizeof(struct ionic_notifyq_event) != 64);
>> + BUILD_BUG_ON(sizeof(struct ionic_link_change_event) != 64);
>> + BUILD_BUG_ON(sizeof(struct ionic_reset_event) != 64);
>> + BUILD_BUG_ON(sizeof(struct ionic_heartbeat_event) != 64);
>> + BUILD_BUG_ON(sizeof(struct ionic_log_event) != 64);
>> +
>> + /* I/O */
>> + BUILD_BUG_ON(sizeof(struct ionic_txq_desc) != 16);
>> + BUILD_BUG_ON(sizeof(struct ionic_txq_sg_desc) != 128);
>> + BUILD_BUG_ON(sizeof(struct ionic_txq_comp) != 16);
>> +
>> + BUILD_BUG_ON(sizeof(struct ionic_rxq_desc) != 16);
>> + BUILD_BUG_ON(sizeof(struct ionic_rxq_sg_desc) != 128);
>> + BUILD_BUG_ON(sizeof(struct ionic_rxq_comp) != 16);
> static_assert() for all of those? That way you don't need this fake
> function.
Sure
>
>> +}
>> +
>> +struct ionic_devinfo {
>> + u8 asic_type;
>> + u8 asic_rev;
>> + char fw_version[IONIC_DEVINFO_FWVERS_BUFLEN + 1];
>> + char serial_num[IONIC_DEVINFO_SERIAL_BUFLEN + 1];
>> +};
>> +
>> +struct ionic_dev {
>> + union ionic_dev_info_regs __iomem *dev_info_regs;
>> + union ionic_dev_cmd_regs __iomem *dev_cmd_regs;
>> +
>> + u64 __iomem *db_pages;
>> + dma_addr_t phy_db_pages;
>> +
>> + struct ionic_intr __iomem *intr_ctrl;
>> + u64 __iomem *intr_status;
>> +
>> + struct ionic_devinfo dev_info;
>> +};
>> +
>> +struct ionic;
>> +
>> +void ionic_init_devinfo(struct ionic *ionic);
>> +int ionic_dev_setup(struct ionic *ionic);
>> +void ionic_dev_teardown(struct ionic *ionic);
>> +
>> +void ionic_dev_cmd_go(struct ionic_dev *idev, union ionic_dev_cmd *cmd);
>> +u8 ionic_dev_cmd_status(struct ionic_dev *idev);
>> +bool ionic_dev_cmd_done(struct ionic_dev *idev);
>> +void ionic_dev_cmd_comp(struct ionic_dev *idev, union ionic_dev_cmd_comp *comp);
>> +
>> +void ionic_dev_cmd_identify(struct ionic_dev *idev, u8 ver);
>> +void ionic_dev_cmd_init(struct ionic_dev *idev);
>> +void ionic_dev_cmd_reset(struct ionic_dev *idev);
>> +
>> +#endif /* _IONIC_DEV_H_ */
>> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_devlink.c b/drivers/net/ethernet/pensando/ionic/ionic_devlink.c
>> index e24ef6971cd5..1ca1e33cca04 100644
>> --- a/drivers/net/ethernet/pensando/ionic/ionic_devlink.c
>> +++ b/drivers/net/ethernet/pensando/ionic/ionic_devlink.c
>> @@ -11,8 +11,28 @@
>> static int ionic_dl_info_get(struct devlink *dl, struct devlink_info_req *req,
>> struct netlink_ext_ack *extack)
>> {
>> + struct ionic *ionic = devlink_priv(dl);
>> + struct ionic_dev *idev = &ionic->idev;
>> + char buf[16];
>> +
>> devlink_info_driver_name_put(req, IONIC_DRV_NAME);
>>
>> + devlink_info_version_running_put(req,
>> + DEVLINK_INFO_VERSION_GENERIC_FW_MGMT,
>> + idev->dev_info.fw_version);
> Are you sure this is not the FW that controls the data path?
There is only one FW rev to report, and this covers mgmt and data.
>
>> + snprintf(buf, sizeof(buf), "0x%x", idev->dev_info.asic_type);
>> + devlink_info_version_fixed_put(req,
>> + DEVLINK_INFO_VERSION_GENERIC_BOARD_ID,
>> + buf);
> Board ID is not ASIC. This is for identifying a board version with all
> its components which surround the main ASIC.
>
>> + snprintf(buf, sizeof(buf), "0x%x", idev->dev_info.asic_rev);
>> + devlink_info_version_fixed_put(req,
>> + DEVLINK_INFO_VERSION_GENERIC_BOARD_REV,
>> + buf);
> ditto
Since I don't have any board info available at this point, shall I use
my own "asic.id" and "asic.rev" strings, or in this patch shall I add
something like this to devlink.h and use them here:
/* Part number, identifier of asic design */
#define DEVLINK_INFO_VERSION_GENERIC_ASIC_ID "asic.id"
/* Revision of asic design */
#define DEVLINK_INFO_VERSION_GENERIC_ASIC_REV "asic.rev"
>> + devlink_info_serial_number_put(req, idev->dev_info.serial_num);
>> +
>> return 0;
>> }
>>
>> @@ -41,3 +61,22 @@ void ionic_devlink_free(struct ionic *ionic)
>> {
>> devlink_free(ionic->dl);
>> }
>> +
>> +int ionic_devlink_register(struct ionic *ionic)
>> +{
>> + int err;
>> +
>> + err = devlink_register(ionic->dl, ionic->dev);
>> + if (err)
>> + dev_warn(ionic->dev, "devlink_register failed: %d\n", err);
>> +
>> + return err;
>> +}
>> +
>> +void ionic_devlink_unregister(struct ionic *ionic)
>> +{
>> + if (!ionic || !ionic->dl)
>> + return;
> Impossible
Sure.
Thanks,
sln
^ permalink raw reply
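Note on the static_assert() suggestion in the review above: a file-scope static assertion is evaluated at compile time and needs no enclosing function, which is why the wrapper function becomes unnecessary. The sketch below uses the C11 `static_assert` from <assert.h> in userspace; the struct and function names are hypothetical, and the kernel supplies its own static_assert() macro rather than this header.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-byte register layout, standing in for a device struct. */
struct sketch_doorbell {
	uint32_t ring;
	uint32_t index;
};

/* File scope: checked at compile time, no wrapper function required. */
static_assert(sizeof(struct sketch_doorbell) == 8,
	      "sketch_doorbell must be exactly 8 bytes");

/* Returns the size that the assertion above has already verified. */
static size_t sketch_doorbell_size(void)
{
	return sizeof(struct sketch_doorbell);
}
```

If the layout ever changes size, the build fails at the assertion line instead of inside a function that nothing calls.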
* Re: [PATCH net] tcp: inherit timestamp on mtu probe
From: Eric Dumazet @ 2019-08-27 21:23 UTC (permalink / raw)
To: Willem de Bruijn; +Cc: netdev, David Miller, Jakub Kicinski
In-Reply-To: <CA+FuTSeVKSJHXY_LwJBiVreqm+MUSoJt+Dp3mdATKvB48DUz-g@mail.gmail.com>
On Tue, Aug 27, 2019 at 11:16 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> On Tue, Aug 27, 2019 at 4:58 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Tue, Aug 27, 2019 at 10:54 PM Willem de Bruijn
> > <willemdebruijn.kernel@gmail.com> wrote:
> >
> > > Sure, that's more descriptive.
> > >
> > > One caveat, the function is exposed in a header, so it's a
> > > bit more churn. If you don't mind that, I'll send the v2.
> >
> > Oh right it is also used from tcp_shifted_skb() after Martin KaFai Lau fix ...
>
> Leave as is then?
Not a big deal really ;)
Signed-off-by: Eric Dumazet <edumazet@google.com>
^ permalink raw reply
* Re: Unable to create htb tc classes more than 64K
From: Dave Taht @ 2019-08-27 21:41 UTC (permalink / raw)
To: Eric Dumazet
Cc: Cong Wang, Akshat Kakkar, Anton Danilov, NetFilter, lartc, netdev,
bloat
In-Reply-To: <48a3284b-e8ba-f169-6a2d-9611f8538f07@gmail.com>
On Tue, Aug 27, 2019 at 2:09 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 8/27/19 10:53 PM, Dave Taht wrote:
> >
> > Although this is very cool, I think in this case the OP is being
> > a router, not server?
>
> This mechanism is generic. EDT has not been designed for servers only.
>
> One HTB class (with one associated qdisc per leaf) per rate limiter
> does not scale, and consumes a _lot_ more memory.
>
> We have abandoned HTB at Google for these reasons.
>
> Nice thing with EDT is that you can stack arbitrary number of rate limiters,
> and still keep a single queue (in FQ or another layer downstream)
There's a lot of nice things about EDT! I'd followed along on the
theory, timerwheels, virtual clocks, etc, and went
seeking ethernet hw that could do it (directly) on the low end and
came up empty - and doing anything with the concept required a
complete rethink on everything we were already doing in
wifi/fq_codel/cake ;(, and after we shipped cake in 4.19, I bought a
sailboat, and logged out for a while.
The biggest problem bufferbloat.net has left is more efficient inbound
shaping/policing on cheap hw.
I don't suppose you've solved that already? :puppy dog eyes:
Next year's version of openwrt we can maybe try to do something
coherent with EDT.
>
--
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740
^ permalink raw reply
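Note on the EDT remark quoted above ("stack arbitrary number of rate limiters, and still keep a single queue"): one way to read it is that every limiter only contributes a timestamp, and the packet carries the latest of them to a single downstream queue. The sketch below is an assumption-laden userspace illustration of that reading, not code from FQ or from any Google system; all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC 1000000000ULL

/* Hypothetical rate limiter: a rate and the earliest time at which it
 * allows the next packet to leave. */
struct sketch_limiter {
	uint64_t rate_bytes_per_sec; /* must be non-zero */
	uint64_t next_tx_ns;         /* earliest permitted departure time */
};

/* Stamp one packet against every limiter that applies to it. The packet
 * departs at the latest of "now" and all limiter deadlines; each limiter
 * then reserves the time the packet occupies at its own rate. */
static uint64_t sketch_edt_stamp(struct sketch_limiter *lims, size_t n,
				 uint64_t now_ns, uint32_t len)
{
	uint64_t depart = now_ns;
	size_t i;

	for (i = 0; i < n; i++)
		if (lims[i].next_tx_ns > depart)
			depart = lims[i].next_tx_ns;

	for (i = 0; i < n; i++)
		lims[i].next_tx_ns = depart +
			(uint64_t)len * SKETCH_NSEC_PER_SEC /
			lims[i].rate_bytes_per_sec;

	return depart;
}
```

Adding another limiter only adds one more entry to the array; the number of queues does not change, in contrast to one HTB class and leaf qdisc per limiter.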
* Re: [PATCH v5 net-next 03/18] ionic: Add port management commands
From: Shannon Nelson @ 2019-08-27 21:44 UTC (permalink / raw)
To: Jakub Kicinski; +Cc: netdev, davem
In-Reply-To: <20190826213631.37b8f56d@cakuba.netronome.com>
On 8/26/19 9:36 PM, Jakub Kicinski wrote:
> On Mon, 26 Aug 2019 14:33:24 -0700, Shannon Nelson wrote:
>> The port management commands apply to the physical port
>> associated with the PCI device, which might be shared among
>> several logical interfaces.
>>
>> Signed-off-by: Shannon Nelson <snelson@pensando.io>
[...]
>> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_dev.h b/drivers/net/ethernet/pensando/ionic/ionic_dev.h
>> index 30a5206bba4e..5b83f21af18a 100644
>> --- a/drivers/net/ethernet/pensando/ionic/ionic_dev.h
>> +++ b/drivers/net/ethernet/pensando/ionic/ionic_dev.h
>> @@ -122,6 +122,10 @@ struct ionic_dev {
>> struct ionic_intr __iomem *intr_ctrl;
>> u64 __iomem *intr_status;
>>
>> + u32 port_info_sz;
>> + struct ionic_port_info *port_info;
>> + dma_addr_t port_info_pa;
>> +
>> struct ionic_devinfo dev_info;
>> };
>>
>> @@ -140,4 +144,15 @@ void ionic_dev_cmd_identify(struct ionic_dev *idev, u8 ver);
>> void ionic_dev_cmd_init(struct ionic_dev *idev);
>> void ionic_dev_cmd_reset(struct ionic_dev *idev);
>>
>> +void ionic_dev_cmd_port_identify(struct ionic_dev *idev);
>> +void ionic_dev_cmd_port_init(struct ionic_dev *idev);
>> +void ionic_dev_cmd_port_reset(struct ionic_dev *idev);
>> +void ionic_dev_cmd_port_state(struct ionic_dev *idev, u8 state);
>> +void ionic_dev_cmd_port_speed(struct ionic_dev *idev, u32 speed);
>> +void ionic_dev_cmd_port_mtu(struct ionic_dev *idev, u32 mtu);
>> +void ionic_dev_cmd_port_autoneg(struct ionic_dev *idev, u8 an_enable);
>> +void ionic_dev_cmd_port_fec(struct ionic_dev *idev, u8 fec_type);
>> +void ionic_dev_cmd_port_pause(struct ionic_dev *idev, u8 pause_type);
>> +void ionic_dev_cmd_port_loopback(struct ionic_dev *idev, u8 loopback_mode);
> I don't think you call most of these functions in this patch.
No, but most get used in the ethtool code added a few patches later.
The port_mtu probably won't get used, so I can pull that out. The
port_loopback will get used when I add a loopback test, but I can pull
that out for now until that test is added.
>
>> #endif /* _IONIC_DEV_H_ */
>> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_main.c b/drivers/net/ethernet/pensando/ionic/ionic_main.c
>> index f52eb6c50358..47928f184230 100644
>> --- a/drivers/net/ethernet/pensando/ionic/ionic_main.c
>> +++ b/drivers/net/ethernet/pensando/ionic/ionic_main.c
>> @@ -309,6 +309,92 @@ int ionic_reset(struct ionic *ionic)
>> return err;
>> }
>>
>> +int ionic_port_identify(struct ionic *ionic)
>> +{
>> + struct ionic_identity *ident = &ionic->ident;
>> + struct ionic_dev *idev = &ionic->idev;
>> + size_t sz;
>> + int err;
>> +
>> + mutex_lock(&ionic->dev_cmd_lock);
>> +
>> + ionic_dev_cmd_port_identify(idev);
>> + err = ionic_dev_cmd_wait(ionic, devcmd_timeout);
>> + if (!err) {
>> + sz = min(sizeof(ident->port), sizeof(idev->dev_cmd_regs->data));
>> + memcpy_fromio(&ident->port, &idev->dev_cmd_regs->data, sz);
>> + }
>> +
>> + mutex_unlock(&ionic->dev_cmd_lock);
>> +
>> + return err;
>> +}
>> +
>> +int ionic_port_init(struct ionic *ionic)
>> +{
>> + struct ionic_identity *ident = &ionic->ident;
>> + struct ionic_dev *idev = &ionic->idev;
>> + size_t sz;
>> + int err;
>> +
>> + if (idev->port_info)
>> + return 0;
>> +
>> + idev->port_info_sz = ALIGN(sizeof(*idev->port_info), PAGE_SIZE);
>> + idev->port_info = dma_alloc_coherent(ionic->dev, idev->port_info_sz,
>> + &idev->port_info_pa,
>> + GFP_KERNEL);
>> + if (!idev->port_info) {
>> + dev_err(ionic->dev, "Failed to allocate port info, aborting\n");
>> + return -ENOMEM;
>> + }
>> +
>> + sz = min(sizeof(ident->port.config), sizeof(idev->dev_cmd_regs->data));
>> +
>> + mutex_lock(&ionic->dev_cmd_lock);
>> +
>> + memcpy_toio(&idev->dev_cmd_regs->data, &ident->port.config, sz);
>> + ionic_dev_cmd_port_init(idev);
>> + err = ionic_dev_cmd_wait(ionic, devcmd_timeout);
>> +
>> + ionic_dev_cmd_port_state(&ionic->idev, IONIC_PORT_ADMIN_STATE_UP);
>> + (void)ionic_dev_cmd_wait(ionic, devcmd_timeout);
>> +
>> + mutex_unlock(&ionic->dev_cmd_lock);
>> + if (err) {
>> + dev_err(ionic->dev, "Failed to init port\n");
> The lifetime of port_info seems a little strange. Why is it left in
> place even if the command failed? Doesn't this leak memory?
>
>> + return err;
>> + }
>> +
>> + return 0;
> return err; works for both paths
>
>> +}
>> +
>> +int ionic_port_reset(struct ionic *ionic)
>> +{
>> + struct ionic_dev *idev = &ionic->idev;
>> + int err;
>> +
>> + if (!idev->port_info)
>> + return 0;
>> +
>> + mutex_lock(&ionic->dev_cmd_lock);
>> + ionic_dev_cmd_port_reset(idev);
>> + err = ionic_dev_cmd_wait(ionic, devcmd_timeout);
>> + mutex_unlock(&ionic->dev_cmd_lock);
>> + if (err) {
>> + dev_err(ionic->dev, "Failed to reset port\n");
>> + return err;
> Again, memory leak if command fails? (nothing frees port_info)
>
>> + }
>> +
>> + dma_free_coherent(ionic->dev, idev->port_info_sz,
>> + idev->port_info, idev->port_info_pa);
>> +
>> + idev->port_info = NULL;
>> + idev->port_info_pa = 0;
>> +
>> + return err;
> Well, with current code err can only be 0 at this point.
I'll revisit these bits.
sln
^ permalink raw reply
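Note on the port_info lifetime questions in the review above: one possible shape for the init path is to release the buffer when the device command fails, so that a failed init owns nothing and a single `return err` serves both outcomes. This is a userspace sketch only; calloc()/free() stand in for dma_alloc_coherent()/dma_free_coherent(), and every `sketch_*` name is hypothetical. It is not the change the author eventually made.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical device state for illustration. */
struct sketch_dev {
	void *port_info;
	size_t port_info_sz;
};

/* Stand-in for issuing a device command; returns 0 or a negative errno. */
typedef int (*sketch_cmd_fn)(struct sketch_dev *dev);

static int sketch_port_init(struct sketch_dev *dev, sketch_cmd_fn cmd)
{
	int err;

	if (dev->port_info)
		return 0;

	dev->port_info_sz = 4096;
	dev->port_info = calloc(1, dev->port_info_sz);
	if (!dev->port_info)
		return -ENOMEM;

	err = cmd(dev);
	if (err) {
		/* Undo the allocation so a failed init owns nothing. */
		free(dev->port_info);
		dev->port_info = NULL;
		dev->port_info_sz = 0;
	}

	return err; /* single return covers both paths */
}

/* Trivial command stubs used for illustration only. */
static int sketch_cmd_ok(struct sketch_dev *dev) { (void)dev; return 0; }
static int sketch_cmd_fail(struct sketch_dev *dev) { (void)dev; return -EIO; }
```

Because the pointer is reset to NULL on failure, a later retry allocates again instead of returning early on a stale buffer.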
* Re: [PATCH] net/mlx5: fix a -Wstringop-truncation warning
From: Saeed Mahameed @ 2019-08-27 21:46 UTC (permalink / raw)
To: cai@lca.pw
Cc: linux-rdma@vger.kernel.org, davem@davemloft.net, Moshe Shemesh,
Feras Daoud, linux-kernel@vger.kernel.org, Eran Ben Elisha,
netdev@vger.kernel.org, leon@kernel.org
In-Reply-To: <1566936733.5576.16.camel@lca.pw>
On Tue, 2019-08-27 at 16:12 -0400, Qian Cai wrote:
> On Mon, 2019-08-26 at 21:11 +0000, Saeed Mahameed wrote:
> > On Fri, 2019-08-23 at 15:56 -0400, Qian Cai wrote:
> > > In file included from ./arch/powerpc/include/asm/paca.h:15,
> > > from ./arch/powerpc/include/asm/current.h:13,
> > > from ./include/linux/thread_info.h:21,
> > > from ./include/asm-generic/preempt.h:5,
> > > from ./arch/powerpc/include/generated/asm/preempt.h:1,
> > > from ./include/linux/preempt.h:78,
> > > from ./include/linux/spinlock.h:51,
> > > from ./include/linux/wait.h:9,
> > > from ./include/linux/completion.h:12,
> > > from ./include/linux/mlx5/driver.h:37,
> > > from drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h:6,
> > > from drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c:33:
> > > In function 'strncpy',
> > > inlined from 'mlx5_fw_tracer_save_trace' at drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c:549:2,
> > > inlined from 'mlx5_tracer_print_trace' at drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c:574:2:
> > > ./include/linux/string.h:305:9: warning: '__builtin_strncpy' output may be truncated copying 256 bytes from a string of length 511 [-Wstringop-truncation]
> > > return __builtin_strncpy(p, q, size);
> > > ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > >
> > > Fix it by using the new strscpy_pad() introduced by commit
> > > 458a3bf82df4 ("lib/string: Add strscpy_pad() function"), which
> > > always NUL-terminates the string and avoids possibly leaking data
> > > through the ring buffer, where a non-admin account might enable
> > > these events through perf.
> > >
> > > Fixes: fd1483fe1f9f ("net/mlx5: Add support for FW reporter dump")
> > > Signed-off-by: Qian Cai <cai@lca.pw>
> >
> > Hi Qian and thanks for your patch,
> >
> > We already have a patch that handles this issue; please check it
> > out:
> > https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/commit/?h=net-next-mlx5
> >
>
> That commit will make "struct mlx5_fw_tracer" too large and trigger a
> warning in __alloc_pages_nodemask(),
>
I see, thanks for the input. The patch is still under review and has
not yet been passed to the regression queue.
I will take your patch and fix ours on top of it.
> /*
>  * There are several places where we assume that the order value is sane
>  * so bail out early if the request is out of bound.
>  */
> if (unlikely(order >= MAX_ORDER)) {
> 	WARN_ON_ONCE(!(gfp_mask & __GFP_NOWARN));
> 	return NULL;
> }
>
> [ 98.339576][ T914] WARNING: CPU: 0 PID: 914 at mm/page_alloc.c:4705 __alloc_pages_nodemask+0x441/0x1bb0
> [ 98.349174][ T914] Modules linked in: smartpqi(+) scsi_transport_sas tg3 mlx5_core(+) libphy firmware_class dm_mirror dm_region_hash dm_log dm_mod efivarfs
> [ 98.363495][ T914] CPU: 0 PID: 914 Comm: kworker/0:2 Not tainted 5.3.0-rc6-next-20190827+ #14
> [ 98.372243][ T914] Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
> [ 98.381627][ T914] Workqueue: events work_for_cpu_fn
> [ 98.386720][ T914] RIP: 0010:__alloc_pages_nodemask+0x441/0x1bb0
> [ 98.392917][ T914] Code: 17 00 00 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 89 85 3c fe ff ff bb 01 00 00 00 e9 96 fd ff ff 81 e7 00 20 00 00 75 02 <0f> 0b 48 c7 85 50 fe ff ff 00 00 00 00 eb 82 31 d2 be 36 12 00 00
> [ 98.412740][ T914] RSP: 0018:ffff88853418f948 EFLAGS: 00010246
> [ 98.418704][ T914] RAX: 0000000000000000 RBX: ffffffff9571a860 RCX: 1ffff110a6831f3e
> [ 98.426652][ T914] RDX: 0000000000000000 RSI: 000000000000000b RDI: 0000000000000000
> [ 98.434661][ T914] RBP: ffff88853418fb58 R08: ffffed1108808465 R09: ffffed1108808465
> [ 98.442613][ T914] R10: ffffed1108808464 R11: ffff888844042323 R12: 0000000000000000
> [ 98.450548][ T914] R13: 000000000000000b R14: 0000000000000000 R15: 0000000000000001
> [ 98.458434][ T914] FS: 0000000000000000(0000) GS:ffff888844000000(0000) knlGS:0000000000000000
> [ 98.467350][ T914] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 98.473911][ T914] CR2: 0000555c64680148 CR3: 0000000550412000 CR4: 00000000003406b0
> [ 98.481838][ T914] Call Trace:
> [ 98.485011][ T914] ? find_next_bit+0x2c/0xa0
> [ 98.489490][ T914] ? __kasan_check_write+0x14/0x20
> [ 98.494506][ T914] ? graph_lock+0xb8/0x120
> [ 98.498811][ T914] ? __free_zapped_classes+0x740/0x740
> [ 98.504239][ T914] ? gfp_pfmemalloc_allowed+0xc0/0xc0
> [ 98.509504][ T914] ? __kasan_check_read+0x11/0x20
> [ 98.514443][ T914] ? register_lock_class+0x5ef/0x960
> [ 98.519624][ T914] ? rcu_read_lock_sched_held+0xac/0xe0
> [ 98.525152][ T914] ? rcu_read_lock_any_held.part.5+0x20/0x20
> [ 98.531130][ T914] ? find_next_bit+0x2c/0xa0
> [ 98.535610][ T914] alloc_pages_current+0x9c/0x110
> [ 98.540638][ T914] kmalloc_order+0x22/0x70
> [ 98.544943][ T914] kmalloc_order_trace+0x23/0x100
> [ 98.550072][ T914] mlx5_fw_tracer_create+0x51/0x870 [mlx5_core]
> [ 98.556213][ T914] ? __mutex_init+0x94/0xa0
> [ 98.560744][ T914] ? mlx5_init_rl_table+0x144/0x210 [mlx5_core]
> [ 98.566929][ T914] mlx5_load_one+0x199/0x980 [mlx5_core]
> [ 98.572637][ T914] init_one+0x494/0x760 [mlx5_core]
> [ 98.577771][ T914] ? mlx5_pci_resume+0xd0/0xd0 [mlx5_core]
> [ 98.583574][ T914] local_pci_probe+0x7a/0xc0
> [ 98.588054][ T914] ? pci_dma_configure+0xa0/0xa0
> [ 98.592938][ T914] work_for_cpu_fn+0x2e/0x50
> [ 98.597416][ T914] process_one_work+0x53b/0xa70
> [ 98.602220][ T914] ? pwq_dec_nr_in_flight+0x170/0x170
> [ 98.607485][ T914] ? move_linked_works+0x113/0x150
> [ 98.612497][ T914] worker_thread+0x363/0x5b0
> [ 98.616976][ T914] kthread+0x1df/0x200
> [ 98.620932][ T914] ? process_one_work+0xa70/0xa70
> [ 98.625847][ T914] ? kthread_park+0xd0/0xd0
> [ 98.630240][ T914] ret_from_fork+0x22/0x40
^ permalink raw reply
* Re: [PATCH net-next v2 3/3] dt-bindings: net: ethernet: Update mt7622 docs and dts to reflect the new phylink API
From: Rob Herring @ 2019-08-27 21:51 UTC (permalink / raw)
To: René van Dorst
Cc: John Crispin, Sean Wang, Nelson Chang, David S . Miller,
Matthias Brugger, netdev, linux-arm-kernel, linux-mediatek,
linux-mips, Frank Wunderlich, Stefan Roese, devicetree
In-Reply-To: <20190821144336.9259-4-opensource@vdorst.com>
On Wed, Aug 21, 2019 at 04:43:36PM +0200, René van Dorst wrote:
> This patch removes the recently added mediatek,physpeed property.
> Use the fixed-link property speed = <2500> to set the phy in 2.5Gbit.
> See mt7622-bananapi-bpi-r64.dts for a working example.
>
> Signed-off-by: René van Dorst <opensource@vdorst.com>
> Cc: devicetree@vger.kernel.org
> Cc: Rob Herring <robh@kernel.org>
> --
> v1->v2:
> * SGMII port only support BASE-X at 2.5Gbit.
> ---
> .../arm/mediatek/mediatek,sgmiisys.txt | 2 --
Bindings and dts files should be separate patches.
> .../dts/mediatek/mt7622-bananapi-bpi-r64.dts | 28 +++++++++++++------
> arch/arm64/boot/dts/mediatek/mt7622.dtsi | 1 -
> 3 files changed, 19 insertions(+), 12 deletions(-)
In any case,
Acked-by: Rob Herring <robh@kernel.org>
^ permalink raw reply
* Re: [PATCH] net/mlx5: fix a -Wstringop-truncation warning
From: Saeed Mahameed @ 2019-08-27 21:51 UTC (permalink / raw)
To: cai@lca.pw
Cc: linux-rdma@vger.kernel.org, davem@davemloft.net, Moshe Shemesh,
Feras Daoud, linux-kernel@vger.kernel.org, Eran Ben Elisha,
netdev@vger.kernel.org, leon@kernel.org
In-Reply-To: <1566590183-9898-1-git-send-email-cai@lca.pw>
On Fri, 2019-08-23 at 15:56 -0400, Qian Cai wrote:
> In file included from ./arch/powerpc/include/asm/paca.h:15,
> from ./arch/powerpc/include/asm/current.h:13,
> from ./include/linux/thread_info.h:21,
> from ./include/asm-generic/preempt.h:5,
> from ./arch/powerpc/include/generated/asm/preempt.h:1,
> from ./include/linux/preempt.h:78,
> from ./include/linux/spinlock.h:51,
> from ./include/linux/wait.h:9,
> from ./include/linux/completion.h:12,
> from ./include/linux/mlx5/driver.h:37,
> from drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h:6,
> from drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c:33:
> In function 'strncpy',
> inlined from 'mlx5_fw_tracer_save_trace' at drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c:549:2,
> inlined from 'mlx5_tracer_print_trace' at drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c:574:2:
> ./include/linux/string.h:305:9: warning: '__builtin_strncpy' output may be truncated copying 256 bytes from a string of length 511 [-Wstringop-truncation]
> return __builtin_strncpy(p, q, size);
> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> Fix it by using the new strscpy_pad() introduced by commit 458a3bf82df4
> ("lib/string: Add strscpy_pad() function"), which always
> NUL-terminates the string and avoids possibly leaking data through the
> ring buffer, where a non-admin account might enable these events
> through perf.
>
> Fixes: fd1483fe1f9f ("net/mlx5: Add support for FW reporter dump")
> Signed-off-by: Qian Cai <cai@lca.pw>
>
Applied to mlx5-next, Thanks !
^ permalink raw reply
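Note on the strscpy_pad() fix quoted above: strncpy() leaves the destination without a terminating NUL when the source is at least as long as the count, which is what the compiler warning points at. The userspace helper below sketches the behaviour the commit message relies on (always terminate, zero the unused tail, report truncation). `sketch_copy_pad` is a hypothetical name and this is not the kernel's implementation.

```c
#include <errno.h>
#include <string.h>

/* Copy src into dst (capacity size). Always NUL-terminates when size > 0,
 * zero-fills the unused tail, and returns the copied length or -E2BIG if
 * the source had to be truncated. */
static long sketch_copy_pad(char *dst, const char *src, size_t size)
{
	const char *end;
	size_t len;

	if (size == 0)
		return -E2BIG;

	/* Look for the terminator within the first size bytes only. */
	end = memchr(src, '\0', size);
	len = end ? (size_t)(end - src) : size;

	if (len == size) {
		/* Source does not fit: truncate but still terminate. */
		memcpy(dst, src, size - 1);
		dst[size - 1] = '\0';
		return -E2BIG;
	}

	memcpy(dst, src, len);
	/* Zero the tail so no stale bytes remain in the buffer. */
	memset(dst + len, 0, size - len);
	return (long)len;
}
```

The zero-filled tail is the part that matters for the information-leak concern: bytes left over from an earlier, longer string cannot reach a reader of the ring buffer.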
* Re: [PATCH v2 net] Add genphy_c45_config_aneg() function to phy-c45.c
From: David Miller @ 2019-08-27 22:01 UTC (permalink / raw)
To: marco.hartmann
Cc: andrew, f.fainelli, hkallweit1, netdev, linux-kernel,
christian.herber
In-Reply-To: <1566385208-23523-1-git-send-email-marco.hartmann@nxp.com>
From: Marco Hartmann <marco.hartmann@nxp.com>
Date: Wed, 21 Aug 2019 11:00:46 +0000
> Commit 34786005eca3 ("net: phy: prevent PHYs w/o Clause 22 regs from calling
> genphy_config_aneg") introduced a check that aborts phy_config_aneg()
> if the phy is a C45 phy.
> This causes phy_state_machine() to call phy_error() so that the phy
> ends up in PHY_HALTED state.
>
> Instead of returning -EOPNOTSUPP, call genphy_c45_config_aneg()
> (analogous to the C22 case) so that the state machine can run
> correctly.
>
> genphy_c45_config_aneg() closely resembles mv3310_config_aneg()
> in drivers/net/phy/marvell10g.c, excluding vendor specific
> configurations for 1000BaseT.
>
> Fixes: 22b56e827093 ("net: phy: replace genphy_10g_driver with genphy_c45_driver")
>
> Signed-off-by: Marco Hartmann <marco.hartmann@nxp.com>
Andrew, gentle ping to respond to Heiner who said:
> For me this patch would be ok, even though this generic config_aneg
> doesn't support 1000BaseT.
> 1. The whole genphy_c45 driver doesn't make sense w/o a config_aneg
> callback implementation.
> 2. It can serve as a temporary fallback for new C45 PHY's that don't
> have a dedicated driver yet.
> 3. We may have C45 PHYs not supporting 1000BaseT (e.g. T1).
>
> Andrew?
Thanks.
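
The change under discussion can be pictured with a small standalone sketch. All types and function names below (struct phy_dev, generic_c22_config_aneg(), generic_c45_config_aneg(), config_aneg()) are hypothetical stand-ins chosen for this note, not the phylib API; the point is only the dispatch order: a driver callback wins, a C45 PHY without Clause 22 registers falls back to a generic C45 routine instead of an error, and everything else takes the C22 path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct phy_dev;

/* Hypothetical driver ops: only the autoneg configuration hook. */
struct phy_drv {
	int (*config_aneg)(struct phy_dev *phy);
};

/* Hypothetical PHY state, reduced to what the dispatch needs. */
struct phy_dev {
	const struct phy_drv *drv;
	bool is_c45;		/* PHY is addressed via Clause 45 */
	bool has_c22_regs;	/* PHY also implements Clause 22 registers */
	int last_path;		/* records which generic path ran */
};

enum { PATH_NONE, PATH_C22, PATH_C45 };

int generic_c22_config_aneg(struct phy_dev *phy)
{
	phy->last_path = PATH_C22;
	return 0;
}

int generic_c45_config_aneg(struct phy_dev *phy)
{
	phy->last_path = PATH_C45;
	return 0;
}

int config_aneg(struct phy_dev *phy)
{
	assert(phy != NULL);

	/* A dedicated driver callback always takes precedence. */
	if (phy->drv && phy->drv->config_aneg)
		return phy->drv->config_aneg(phy);

	/*
	 * Previously this case returned an error, which halted the
	 * state machine; here it uses the generic C45 routine instead.
	 */
	if (phy->is_c45 && !phy->has_c22_regs)
		return generic_c45_config_aneg(phy);

	return generic_c22_config_aneg(phy);
}
```

With this ordering, a new C45 PHY without its own driver still gets a working, if limited, autoneg configuration, which matches point 2 in Heiner's list.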
* Re: [PATCH net-next v5] sched: Add dualpi2 qdisc
From: David Miller @ 2019-08-27 22:03 UTC (permalink / raw)
To: olivier.tilmans
Cc: eric.dumazet, stephen, olga, koen.de_schepper, research, henrist,
jhs, xiyou.wangcong, jiri, linux-kernel, netdev
In-Reply-To: <20190822080045.27609-1-olivier.tilmans@nokia-bell-labs.com>
From: "Tilmans, Olivier (Nokia - BE/Antwerp)" <olivier.tilmans@nokia-bell-labs.com>
Date: Thu, 22 Aug 2019 08:10:48 +0000
> +static inline struct dualpi2_skb_cb *dualpi2_skb_cb(struct sk_buff *skb)
Please do not use the inline keyword in foo.c files, let the compiler decide.
> +static struct sk_buff *dualpi2_qdisc_dequeue(struct Qdisc *sch)
> +{
> + struct dualpi2_sched_data *q = qdisc_priv(sch);
> + struct sk_buff *skb;
> + int qlen_c, credit_change;
Reverse christmas tree here, please.
> +static void dualpi2_timer(struct timer_list *timer)
> +{
> + struct dualpi2_sched_data *q = from_timer(q, timer, pi2.timer);
> + struct Qdisc *sch = q->sch;
> + spinlock_t *root_lock; /* Lock to access the head of both queues. */
Likewise, and please remove this comment; it makes the variable declarations
look odd.
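
For readers unfamiliar with the term, "reverse christmas tree" means ordering local variable declarations from the longest line to the shortest. A minimal sketch, using hypothetical stand-in types (demo_sched_data, demo_skb) rather than the real qdisc structures:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the scheduler private data and packet. */
struct demo_sched_data {
	int credit;
};

struct demo_skb {
	int len;
};

int demo_dequeue_len(struct demo_sched_data *priv, struct demo_skb *head)
{
	/* Declarations ordered longest line first, shortest last. */
	struct demo_sched_data *q = priv;
	int qlen_c, credit_change;
	struct demo_skb *skb;

	assert(q != NULL);

	skb = head;
	qlen_c = skb ? skb->len : 0;
	credit_change = q->credit;

	return qlen_c + credit_change;
}
```

Applied to the quoted dualpi2_qdisc_dequeue(), this would mean moving the `int qlen_c, credit_change;` line above the shorter `struct sk_buff *skb;` line.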
* Re: [PATCH v2] bonding: force enable lacp port after link state recovery for 802.3ad
From: David Miller @ 2019-08-27 22:04 UTC (permalink / raw)
To: zhangsha.zhang
Cc: j.vosburgh, vfalico, andy, netdev, linux-kernel, yuehaibing,
hunongda, alex.chen
In-Reply-To: <20190823034209.14596-1-zhangsha.zhang@huawei.com>
From: <zhangsha.zhang@huawei.com>
Date: Fri, 23 Aug 2019 11:42:09 +0800
> - If getting speed/duplex fails here, the link status
> will be changed to BOND_LINK_FAIL;
How does it fail at this step? I suspect this is a driver-specific
problem.
* Re: [Patch net] net_sched: fix a NULL pointer deref in ipt action
From: David Miller @ 2019-08-27 22:06 UTC (permalink / raw)
To: xiyou.wangcong; +Cc: netdev, itugrok, jhs, jiri
In-Reply-To: <20190825170132.31174-1-xiyou.wangcong@gmail.com>
From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Sun, 25 Aug 2019 10:01:32 -0700
> The net pointer in struct xt_tgdtor_param is not explicitly
> initialized therefore is still NULL when dereferencing it.
> So we have to find a way to pass the correct net pointer to
> ipt_destroy_target().
>
> The best way I find is just saving the net pointer inside the per
> netns struct tcf_idrinfo, which could make this patch smaller.
>
> Fixes: 0c66dc1ea3f0 ("netfilter: conntrack: register hooks in netns when needed by ruleset")
> Reported-and-tested-by: itugrok@yahoo.com
> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
> Cc: Jiri Pirko <jiri@resnulli.us>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Applied and queued up for -stable, thanks.
* Re: [PATCH net-next v2 2/3] dt-bindings: net: dsa: mt7530: Add support for port 5
From: Rob Herring @ 2019-08-27 22:22 UTC (permalink / raw)
To: René van Dorst
Cc: Sean Wang, Andrew Lunn, Vivien Didelot, Florian Fainelli,
David S . Miller, Matthias Brugger, netdev, linux-arm-kernel,
linux-mediatek, John Crispin, linux-mips, Frank Wunderlich,
devicetree
In-Reply-To: <20190821144547.15113-3-opensource@vdorst.com>
On Wed, Aug 21, 2019 at 04:45:46PM +0200, René van Dorst wrote:
> MT7530 port 5 has many modes/configurations.
> Update the documentation how to use port 5.
>
> Signed-off-by: René van Dorst <opensource@vdorst.com>
> Cc: devicetree@vger.kernel.org
> Cc: Rob Herring <robh@kernel.org>
> v1->v2:
> * Adding extra note about RGMII2 and gpio use.
> rfc->v1:
> * No change
The changelog goes below the '---'
> ---
> .../devicetree/bindings/net/dsa/mt7530.txt | 218 ++++++++++++++++++
> 1 file changed, 218 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/net/dsa/mt7530.txt b/Documentation/devicetree/bindings/net/dsa/mt7530.txt
> index 47aa205ee0bd..43993aae3f9c 100644
> --- a/Documentation/devicetree/bindings/net/dsa/mt7530.txt
> +++ b/Documentation/devicetree/bindings/net/dsa/mt7530.txt
> @@ -35,6 +35,42 @@ Required properties for the child nodes within ports container:
> - phy-mode: String, must be either "trgmii" or "rgmii" for port labeled
> "cpu".
>
> +Port 5 of the switch is muxed between:
> +1. GMAC5: GMAC5 can interface with another external MAC or PHY.
> +2. PHY of port 0 or port 4: PHY interfaces with an external MAC like 2nd GMAC
> + of the SOC. Used in many setups where port 0/4 becomes the WAN port.
> + Note: On a MT7621 SOC with integrated switch: 2nd GMAC can only be connected to
> + GMAC5 when the gpios for RGMII2 (GPIO 22-33) are not used and not
> + connected to external component!
> +
> +Port 5 modes/configurations:
> +1. Port 5 is disabled and isolated: An external phy can interface to the 2nd
> + GMAC of the SOC.
> + In the case of a built-in MT7530 switch, port 5 shares the RGMII bus with 2nd
> + GMAC and an optional external phy. Mind the GPIO/pinctl settings of the SOC!
> +2. Port 5 is muxed to PHY of port 0/4: Port 0/4 interfaces with 2nd GMAC.
> + It is a simple MAC to PHY interface, port 5 needs to be setup for xMII mode
> + and RGMII delay.
> +3. Port 5 is muxed to GMAC5 and can interface to an external phy.
> + Port 5 becomes an extra switch port.
> + Only works on platform where external phy TX<->RX lines are swapped.
> + Like in the Ubiquiti ER-X-SFP.
> +4. Port 5 is muxed to GMAC5 and interfaces with the 2nd GMAC as 2nd CPU port.
> + Currently a 2nd CPU port is not supported by DSA code.
> +
> +Depending on how the external PHY is wired:
> +1. normal: The PHY can only connect to 2nd GMAC but not to the switch
> +2. swapped: RGMII TX, RX are swapped; external phy interfaces with the switch as
> + an ethernet port. But can't interface to the 2nd GMAC.
> +
> +Based on the DT the port 5 mode is configured.
> +
> +Driver tries to lookup the phy-handle of the 2nd GMAC of the master device.
> +When phy-handle matches PHY of port 0 or 4 then port 5 is set up as mode 2.
> +phy-mode must be set, see also example 2 below!
> + * mt7621: phy-mode = "rgmii-txid";
> + * mt7623: phy-mode = "rgmii";
> +
> See Documentation/devicetree/bindings/net/dsa/dsa.txt for a list of additional
> required, optional properties and how the integrated switch subnodes must
> be specified.
> @@ -94,3 +130,185 @@ Example:
> };
> };
> };
> +
> +Example 2: MT7621: Port 4 is WAN port: 2nd GMAC -> Port 5 -> PHY port 4.
> +
> +&eth {
> + status = "okay";
Don't show status in examples.
This should show the complete node.
> +
> + gmac0: mac@0 {
> + compatible = "mediatek,eth-mac";
> + reg = <0>;
> + phy-mode = "rgmii";
> +
> + fixed-link {
> + speed = <1000>;
> + full-duplex;
> + pause;
> + };
> + };
> +
> + gmac1: mac@1 {
> + compatible = "mediatek,eth-mac";
> + reg = <1>;
> + phy-mode = "rgmii-txid";
> + phy-handle = <&phy4>;
> + };
> +
> + mdio: mdio-bus {
> + #address-cells = <1>;
> + #size-cells = <0>;
> +
> + /* Internal phy */
> + phy4: ethernet-phy@4 {
> + reg = <4>;
> + };
> +
> + mt7530: switch@1f {
> + compatible = "mediatek,mt7621";
> + #address-cells = <1>;
> + #size-cells = <0>;
> + reg = <0x1f>;
> + pinctrl-names = "default";
> + mediatek,mcm;
> +
> + resets = <&rstctrl 2>;
> + reset-names = "mcm";
> +
> + ports {
> + #address-cells = <1>;
> + #size-cells = <0>;
> +
> + port@0 {
> + reg = <0>;
> + label = "lan0";
> + };
> +
> + port@1 {
> + reg = <1>;
> + label = "lan1";
> + };
> +
> + port@2 {
> + reg = <2>;
> + label = "lan2";
> + };
> +
> + port@3 {
> + reg = <3>;
> + label = "lan3";
> + };
> +
> +/* Commented out. Port 4 is handled by 2nd GMAC.
> + port@4 {
> + reg = <4>;
> + label = "lan4";
> + };
> +*/
> +
> + cpu_port0: port@6 {
> + reg = <6>;
> + label = "cpu";
> + ethernet = <&gmac0>;
> + phy-mode = "rgmii";
> +
> + fixed-link {
> + speed = <1000>;
> + full-duplex;
> + pause;
> + };
> + };
> + };
> + };
> + };
> +};
> +
> +Example 3: MT7621: Port 5 is connected to external PHY: Port 5 -> external PHY.
> +
> +&eth {
> + status = "okay";
> +
> + gmac0: mac@0 {
> + compatible = "mediatek,eth-mac";
> + reg = <0>;
> + phy-mode = "rgmii";
> +
> + fixed-link {
> + speed = <1000>;
> + full-duplex;
> + pause;
> + };
> + };
> +
> + mdio: mdio-bus {
> + #address-cells = <1>;
> + #size-cells = <0>;
> +
> + /* External phy */
> + ephy5: ethernet-phy@7 {
> + reg = <7>;
> + };
> +
> + mt7530: switch@1f {
> + compatible = "mediatek,mt7621";
> + #address-cells = <1>;
> + #size-cells = <0>;
> + reg = <0x1f>;
> + pinctrl-names = "default";
> + mediatek,mcm;
> +
> + resets = <&rstctrl 2>;
> + reset-names = "mcm";
> +
> + ports {
> + #address-cells = <1>;
> + #size-cells = <0>;
> +
> + port@0 {
> + reg = <0>;
> + label = "lan0";
> + };
> +
> + port@1 {
> + reg = <1>;
> + label = "lan1";
> + };
> +
> + port@2 {
> + reg = <2>;
> + label = "lan2";
> + };
> +
> + port@3 {
> + reg = <3>;
> + label = "lan3";
> + };
> +
> + port@4 {
> + reg = <4>;
> + label = "lan4";
> + };
> +
> + port@5 {
> + reg = <5>;
> + label = "lan5";
> + phy-mode = "rgmii";
> + phy-handle = <&ephy5>;
> + };
> +
> + cpu_port0: port@6 {
> + reg = <6>;
> + label = "cpu";
> + ethernet = <&gmac0>;
> + phy-mode = "rgmii";
> +
> + fixed-link {
> + speed = <1000>;
> + full-duplex;
> + pause;
> + };
> + };
> + };
> + };
> + };
> +};
> --
> 2.20.1
>
* Re: libbpf distro packaging
From: Julia Kartseva @ 2019-08-27 22:30 UTC (permalink / raw)
To: Jiri Olsa, Alexei Starovoitov
Cc: Andrii Nakryiko, labbott@redhat.com, acme@kernel.org,
debian-kernel@lists.debian.org, netdev@vger.kernel.org,
Andrey Ignatov, Yonghong Song, jolsa@kernel.org, Daniel Borkmann
In-Reply-To: <20190826064235.GA17554@krava>
On 8/25/19, 11:42 PM, "Jiri Olsa" <jolsa@redhat.com> wrote:
> On Fri, Aug 23, 2019 at 04:00:01PM +0000, Alexei Starovoitov wrote:
> >
> > Technically we can bump it at any time.
> > The goal was to bump it only when new kernel is released
> > to capture a collection of new APIs in a given 0.0.X release.
> > So that libbpf versions are synchronized with kernel versions
> > in a somewhat loose way.
> > In this case we can make an exception and bump it now.
>
> I see, I don't think it's worth the exception now,
> the patch is simple or we'll start with 0.0.3
PR introducing 0.0.5 ABI was merged:
https://github.com/libbpf/libbpf/commit/476e158
Jiri, if you'd like to avoid patching, you can start w/ 0.0.5.
Also, if you're planning to use *.spec from libbpf as a source of truth,
it may be enhanced by syncing spec and ABI versions, similar to
https://github.com/libbpf/libbpf/commit/d60f568
* Re: [PATCH bpf-next 0/4] bpf: precision tracking tests
From: Daniel Borkmann @ 2019-08-27 22:43 UTC (permalink / raw)
To: Alexei Starovoitov, davem; +Cc: netdev, bpf, kernel-team
In-Reply-To: <20190823055215.2658669-1-ast@kernel.org>
On 8/23/19 7:52 AM, Alexei Starovoitov wrote:
> Add a few additional tests for precision tracking in the verifier.
>
> Alexei Starovoitov (4):
> bpf: introduce verifier internal test flag
> tools/bpf: sync bpf.h
> selftests/bpf: verifier precise tests
> selftests/bpf: add precision tracking test
>
> include/linux/bpf_verifier.h | 1 +
> include/uapi/linux/bpf.h | 3 +
> kernel/bpf/syscall.c | 1 +
> kernel/bpf/verifier.c | 5 +-
> tools/include/uapi/linux/bpf.h | 3 +
> tools/testing/selftests/bpf/test_verifier.c | 68 +++++++--
> .../testing/selftests/bpf/verifier/precise.c | 142 ++++++++++++++++++
> 7 files changed, 211 insertions(+), 12 deletions(-)
> create mode 100644 tools/testing/selftests/bpf/verifier/precise.c
>
Applied, thanks!