Netdev List

* Re: [RFC PATCH] kcm: hold rx mux lock when updating the receive queue.
From: David Miller @ 2018-06-05 14:53 UTC (permalink / raw)
  To: pabeni; +Cc: netdev, tom, ktkhai
In-Reply-To: <fa80bc9f24e40e1a7a7fa1452330b7f0b7d6e1fe.1528194606.git.pabeni@redhat.com>

From: Paolo Abeni <pabeni@redhat.com>
Date: Tue,  5 Jun 2018 12:32:33 +0200

> @@ -1157,7 +1158,9 @@ static int kcm_recvmsg(struct socket *sock, struct msghdr *msg,
>  			/* Finished with message */
>  			msg->msg_flags |= MSG_EOR;
>  			KCM_STATS_INCR(kcm->stats.rx_msgs);
> +			spin_lock_bh(&kcm->mux->rx_lock);
>  			skb_unlink(skb, &sk->sk_receive_queue);
> +			spin_unlock_bh(&kcm->mux->rx_lock);

Hmmm, maybe I don't understand the corruption.

But, skb_unlink() takes the sk->sk_receive_queue.lock which should
prevent SKB list corruption.
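
To make the point about the queue lock concrete, here is a minimal userspace sketch. It is not kernel code: `struct node`, `struct queue` and the `queue_*` helpers are hypothetical stand-ins for `sk_buff`, `sk_buff_head` and `skb_unlink()`, and a pthread mutex stands in for the spinlock. It only shows that an unlink done under the list's own lock is already serialized against other users of that list.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for sk_buff / sk_buff_head. */
struct node {
	struct node *next, *prev;
};

struct queue {
	struct node head;	/* sentinel; an empty queue points at itself */
	unsigned int qlen;
	pthread_mutex_t lock;	/* plays the role of sk_receive_queue.lock */
};

static void queue_init(struct queue *q)
{
	q->head.next = q->head.prev = &q->head;
	q->qlen = 0;
	pthread_mutex_init(&q->lock, NULL);
}

static void queue_tail(struct queue *q, struct node *n)
{
	pthread_mutex_lock(&q->lock);
	n->prev = q->head.prev;
	n->next = &q->head;
	q->head.prev->next = n;
	q->head.prev = n;
	q->qlen++;
	pthread_mutex_unlock(&q->lock);
}

/* Unlink under the queue's own lock, so list pointers are never
 * modified by two threads at once.
 */
static void queue_unlink(struct queue *q, struct node *n)
{
	pthread_mutex_lock(&q->lock);
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n->prev = NULL;
	q->qlen--;
	pthread_mutex_unlock(&q->lock);
}
```

If the corruption is real, the sketch suggests it would have to come from a path that touches the list without taking this lock, which is the part I may be missing.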

^ permalink raw reply

* [PATCH bpf-next v4 2/2] samples/bpf: Add xdp_sample_pkts example
From: Toke Høiland-Jørgensen @ 2018-06-05 14:50 UTC (permalink / raw)
  To: netdev
In-Reply-To: <152821020087.23694.8231039605257373797.stgit@alrua-kau>

Add an example program showing how to sample packets from XDP using the
perf event buffer. The example userspace program just prints the ethernet
header for every packet sampled.

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
---
 samples/bpf/Makefile               |    4 +
 samples/bpf/xdp_sample_pkts_kern.c |   62 +++++++++++++
 samples/bpf/xdp_sample_pkts_user.c |  176 ++++++++++++++++++++++++++++++++++++
 3 files changed, 242 insertions(+)
 create mode 100644 samples/bpf/xdp_sample_pkts_kern.c
 create mode 100644 samples/bpf/xdp_sample_pkts_user.c

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 1303af10e54d..9ea2f7b64869 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -52,6 +52,7 @@ hostprogs-y += xdp_adjust_tail
 hostprogs-y += xdpsock
 hostprogs-y += xdp_fwd
 hostprogs-y += task_fd_query
+hostprogs-y += xdp_sample_pkts
 
 # Libbpf dependencies
 LIBBPF = $(TOOLS_PATH)/lib/bpf/libbpf.a
@@ -107,6 +108,7 @@ xdp_adjust_tail-objs := xdp_adjust_tail_user.o
 xdpsock-objs := bpf_load.o xdpsock_user.o
 xdp_fwd-objs := bpf_load.o xdp_fwd_user.o
 task_fd_query-objs := bpf_load.o task_fd_query_user.o $(TRACE_HELPERS)
+xdp_sample_pkts-objs := xdp_sample_pkts_user.o $(TRACE_HELPERS)
 
 # Tell kbuild to always build the programs
 always := $(hostprogs-y)
@@ -163,6 +165,7 @@ always += xdp_adjust_tail_kern.o
 always += xdpsock_kern.o
 always += xdp_fwd_kern.o
 always += task_fd_query_kern.o
+always += xdp_sample_pkts_kern.o
 
 HOSTCFLAGS += -I$(objtree)/usr/include
 HOSTCFLAGS += -I$(srctree)/tools/lib/
@@ -179,6 +182,7 @@ HOSTCFLAGS_spintest_user.o += -I$(srctree)/tools/lib/bpf/
 HOSTCFLAGS_trace_event_user.o += -I$(srctree)/tools/lib/bpf/
 HOSTCFLAGS_sampleip_user.o += -I$(srctree)/tools/lib/bpf/
 HOSTCFLAGS_task_fd_query_user.o += -I$(srctree)/tools/lib/bpf/
+HOSTCFLAGS_xdp_sample_pkts_user.o += -I$(srctree)/tools/lib/bpf/
 
 HOST_LOADLIBES		+= $(LIBBPF) -lelf
 HOSTLOADLIBES_tracex4		+= -lrt
diff --git a/samples/bpf/xdp_sample_pkts_kern.c b/samples/bpf/xdp_sample_pkts_kern.c
new file mode 100644
index 000000000000..4560522ca015
--- /dev/null
+++ b/samples/bpf/xdp_sample_pkts_kern.c
@@ -0,0 +1,62 @@
+#include <linux/ptrace.h>
+#include <linux/version.h>
+#include <uapi/linux/bpf.h>
+#include "bpf_helpers.h"
+
+#define SAMPLE_SIZE 64ul
+#define MAX_CPUS 24
+
+#define bpf_printk(fmt, ...)					\
+({								\
+	       char ____fmt[] = fmt;				\
+	       bpf_trace_printk(____fmt, sizeof(____fmt),	\
+				##__VA_ARGS__);			\
+})
+
+struct bpf_map_def SEC("maps") my_map = {
+	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
+	.key_size = sizeof(int),
+	.value_size = sizeof(u32),
+	.max_entries = MAX_CPUS,
+};
+
+SEC("xdp_sample")
+int xdp_sample_prog(struct xdp_md *ctx)
+{
+	void *data_end = (void *)(long)ctx->data_end;
+	void *data = (void *)(long)ctx->data;
+
+	/* Metadata will be in the perf event before the packet data. */
+	struct S {
+		u16 cookie;
+		u16 pkt_len;
+	} __attribute__((packed)) metadata;
+
+	if (data + SAMPLE_SIZE < data_end) {
+		/* The XDP perf_event_output handler will use the upper 32 bits
+		 * of the flags argument as a number of bytes to include of the
+		 * packet payload in the event data. If the size is too big, the
+		 * call to bpf_perf_event_output will fail and return -EFAULT.
+		 *
+		 * See bpf_xdp_event_output in net/core/filter.c.
+		 *
+		 * The BPF_F_CURRENT_CPU flag means that the event output fd
+		 * will be indexed by the CPU number in the event map.
+		 */
+		u64 flags = (SAMPLE_SIZE << 32) | BPF_F_CURRENT_CPU;
+		int ret;
+
+		metadata.cookie = 0xdead;
+		metadata.pkt_len = (u16)(data_end - data);
+
+		ret = bpf_perf_event_output(ctx, &my_map, flags,
+				      &metadata, sizeof(metadata));
+		if (ret)
+			bpf_printk("perf_event_output failed: %d\n", ret);
+	}
+
+	return XDP_PASS;
+}
+
+char _license[] SEC("license") = "GPL";
+u32 _version SEC("version") = LINUX_VERSION_CODE;
diff --git a/samples/bpf/xdp_sample_pkts_user.c b/samples/bpf/xdp_sample_pkts_user.c
new file mode 100644
index 000000000000..672392d48ce3
--- /dev/null
+++ b/samples/bpf/xdp_sample_pkts_user.c
@@ -0,0 +1,176 @@
+/* This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+#include <stdio.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <string.h>
+#include <fcntl.h>
+#include <poll.h>
+#include <linux/perf_event.h>
+#include <linux/bpf.h>
+#include <net/if.h>
+#include <errno.h>
+#include <assert.h>
+#include <sys/sysinfo.h>
+#include <sys/syscall.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <time.h>
+#include <signal.h>
+#include <libbpf.h>
+#include <bpf/bpf.h>
+
+#include "perf-sys.h"
+#include "trace_helpers.h"
+
+#define MAX_CPUS 24
+static int pmu_fds[MAX_CPUS], if_idx = 0;
+static struct perf_event_mmap_page *headers[MAX_CPUS];
+static char *if_name;
+
+static int do_attach(int idx, int fd, const char *name)
+{
+	int err;
+
+	err = bpf_set_link_xdp_fd(idx, fd, 0);
+	if (err < 0)
+		printf("ERROR: failed to attach program to %s\n", name);
+
+	return err;
+}
+
+static int do_detach(int idx, const char *name)
+{
+	int err;
+
+	err = bpf_set_link_xdp_fd(idx, -1, 0);
+	if (err < 0)
+		printf("ERROR: failed to detach program from %s\n", name);
+
+	return err;
+}
+
+#define SAMPLE_SIZE 64
+
+static int print_bpf_output(void *data, int size)
+{
+	struct {
+		__u16 cookie;
+		__u16 pkt_len;
+		__u8  pkt_data[SAMPLE_SIZE];
+	} __attribute__((packed)) *e = data;
+	int i;
+
+	if (e->cookie != 0xdead) {
+		printf("BUG cookie %x sized %d\n",
+		       e->cookie, size);
+		return LIBBPF_PERF_EVENT_ERROR;
+	}
+
+	printf("Pkt len: %-5d bytes. Ethernet hdr: ", e->pkt_len);
+	for (i = 0; i < 14 && i < e->pkt_len; i++)
+		printf("%02x ", e->pkt_data[i]);
+	printf("\n");
+
+	return LIBBPF_PERF_EVENT_CONT;
+}
+
+static void test_bpf_perf_event(int map_fd, int num)
+{
+	struct perf_event_attr attr = {
+		.sample_type = PERF_SAMPLE_RAW,
+		.type = PERF_TYPE_SOFTWARE,
+		.config = PERF_COUNT_SW_BPF_OUTPUT,
+		.wakeup_events = 1, /* get an fd notification for every event */
+	};
+	int i;
+
+	for (i = 0; i < num; i++) {
+		int key = i;
+
+		pmu_fds[i] = sys_perf_event_open(&attr, -1/*pid*/, i/*cpu*/, -1/*group_fd*/, 0);
+
+		assert(pmu_fds[i] >= 0);
+		assert(bpf_map_update_elem(map_fd, &key, &pmu_fds[i], BPF_ANY) == 0);
+		ioctl(pmu_fds[i], PERF_EVENT_IOC_ENABLE, 0);
+	}
+}
+
+static void sig_handler(int signo)
+{
+	do_detach(if_idx, if_name);
+	exit(0);
+}
+
+int main(int argc, char **argv)
+{
+	struct bpf_prog_load_attr prog_load_attr = {
+		.prog_type	= BPF_PROG_TYPE_XDP,
+	};
+	struct bpf_object *obj;
+	struct bpf_map *map;
+	int prog_fd, map_fd;
+	char filename[256];
+	int ret, err, i;
+	int numcpus;
+
+	if (argc < 2) {
+		printf("Usage: %s <ifname>\n", argv[0]);
+		return 1;
+	}
+
+	numcpus = get_nprocs();
+	if (numcpus > MAX_CPUS)
+		numcpus = MAX_CPUS;
+
+	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
+	prog_load_attr.file = filename;
+
+	if (bpf_prog_load_xattr(&prog_load_attr, &obj, &prog_fd))
+		return 1;
+
+	if (!prog_fd) {
+		printf("load_bpf_file: %s\n", strerror(errno));
+		return 1;
+	}
+
+	map = bpf_map__next(NULL, obj);
+	if (!map) {
+		printf("finding a map in obj file failed\n");
+		return 1;
+	}
+	map_fd = bpf_map__fd(map);
+
+	if_idx = if_nametoindex(argv[1]);
+	if (!if_idx)
+		if_idx = strtoul(argv[1], NULL, 0);
+
+	if (!if_idx) {
+		fprintf(stderr, "Invalid ifname\n");
+		return 1;
+	}
+	if_name = argv[1];
+	err = do_attach(if_idx, prog_fd, argv[1]);
+	if (err)
+		return err;
+
+	if (signal(SIGINT, sig_handler) == SIG_ERR ||
+	    signal(SIGHUP, sig_handler) == SIG_ERR ||
+	    signal(SIGTERM, sig_handler) == SIG_ERR) {
+		perror("signal");
+		return 1;
+	}
+
+	test_bpf_perf_event(map_fd, numcpus);
+
+	for (i = 0; i < numcpus; i++)
+		if (perf_event_mmap_header(pmu_fds[i], &headers[i]) < 0)
+			return 1;
+
+	ret = perf_event_poller_multi(pmu_fds, headers, numcpus, print_bpf_output);
+	kill(0, SIGINT);
+	return ret;
+}
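
As a side note on the flags argument used in xdp_sample_pkts_kern.c, the following standalone sketch shows how the sample size and the CPU index share one 64-bit value. The mask values are the ones from include/uapi/linux/bpf.h; the two helper functions are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Values as defined in include/uapi/linux/bpf.h. */
#define BPF_F_INDEX_MASK	0xffffffffULL
#define BPF_F_CURRENT_CPU	BPF_F_INDEX_MASK
#define BPF_F_CTXLEN_MASK	(0xfffffULL << 32)

/* Hypothetical helper: build the flags value for a given sample size. */
static uint64_t xdp_sample_flags(uint64_t sample_size)
{
	/* Upper 32 bits: number of packet bytes to append to the event.
	 * Lower 32 bits: map index, here "whatever CPU we are running on".
	 */
	return (sample_size << 32) | BPF_F_CURRENT_CPU;
}

/* Hypothetical helper: extract the packet byte count from the flags. */
static uint64_t xdp_flags_ctx_len(uint64_t flags)
{
	return (flags & BPF_F_CTXLEN_MASK) >> 32;
}
```

With SAMPLE_SIZE set to 64, `xdp_sample_flags(64)` evaluates to 0x40ffffffff. On the userspace side the event then carries the 4-byte metadata struct followed by the 64 sampled packet bytes, which is the layout print_bpf_output() assumes.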

^ permalink raw reply related

* [PATCH bpf-next v4 1/2] trace_helpers.c: Add helpers to poll multiple perf FDs for events
From: Toke Høiland-Jørgensen @ 2018-06-05 14:50 UTC (permalink / raw)
  To: netdev

Add two new helper functions to trace_helpers that support polling
multiple perf file descriptors for events. These are used by the XDP
perf_event_output example, which needs to work with one perf fd per CPU.

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
---
 tools/testing/selftests/bpf/trace_helpers.c |   47 ++++++++++++++++++++++++++-
 tools/testing/selftests/bpf/trace_helpers.h |    4 ++
 2 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/trace_helpers.c b/tools/testing/selftests/bpf/trace_helpers.c
index 3868dcb63420..1e62d89f34cf 100644
--- a/tools/testing/selftests/bpf/trace_helpers.c
+++ b/tools/testing/selftests/bpf/trace_helpers.c
@@ -88,7 +88,7 @@ static int page_size;
 static int page_cnt = 8;
 static struct perf_event_mmap_page *header;
 
-int perf_event_mmap(int fd)
+int perf_event_mmap_header(int fd, struct perf_event_mmap_page **header)
 {
 	void *base;
 	int mmap_size;
@@ -102,10 +102,15 @@ int perf_event_mmap(int fd)
 		return -1;
 	}
 
-	header = base;
+	*header = base;
 	return 0;
 }
 
+int perf_event_mmap(int fd)
+{
+	return perf_event_mmap_header(fd, &header);
+}
+
 static int perf_event_poll(int fd)
 {
 	struct pollfd pfd = { .fd = fd, .events = POLLIN };
@@ -163,3 +168,41 @@ int perf_event_poller(int fd, perf_event_print_fn output_fn)
 
 	return ret;
 }
+
+int perf_event_poller_multi(int *fds, struct perf_event_mmap_page **headers,
+			    int num_fds, perf_event_print_fn output_fn)
+{
+	enum bpf_perf_event_ret ret = LIBBPF_PERF_EVENT_CONT;
+	struct pollfd *pfds;
+	void *buf = NULL;
+	size_t len = 0;
+	int i;
+
+	pfds = malloc(sizeof(*pfds) * num_fds);
+	if (!pfds)
+		return -1;
+
+	memset(pfds, 0, sizeof(*pfds) * num_fds);
+	for (i = 0; i < num_fds; i++) {
+		pfds[i].fd = fds[i];
+		pfds[i].events = POLLIN;
+	}
+
+	while (ret == LIBBPF_PERF_EVENT_CONT) {
+		poll(pfds, num_fds, 1000);
+		for (i = 0; i < num_fds; i++) {
+			if (pfds[i].revents) {
+				ret = bpf_perf_event_read_simple(headers[i], page_cnt * page_size,
+								page_size, &buf, &len,
+								bpf_perf_event_print,
+								output_fn);
+				if (ret != LIBBPF_PERF_EVENT_CONT)
+					break;
+			}
+		}
+	}
+	free(buf);
+	free(pfds);
+
+	return ret;
+}
diff --git a/tools/testing/selftests/bpf/trace_helpers.h b/tools/testing/selftests/bpf/trace_helpers.h
index 3b4bcf7f5084..18924f23db1b 100644
--- a/tools/testing/selftests/bpf/trace_helpers.h
+++ b/tools/testing/selftests/bpf/trace_helpers.h
@@ -3,6 +3,7 @@
 #define __TRACE_HELPER_H
 
 #include <libbpf.h>
+#include <linux/perf_event.h>
 
 struct ksym {
 	long addr;
@@ -16,6 +17,9 @@ long ksym_get_addr(const char *name);
 typedef enum bpf_perf_event_ret (*perf_event_print_fn)(void *data, int size);
 
 int perf_event_mmap(int fd);
+int perf_event_mmap_header(int fd, struct perf_event_mmap_page **header);
 /* return LIBBPF_PERF_EVENT_DONE or LIBBPF_PERF_EVENT_ERROR */
 int perf_event_poller(int fd, perf_event_print_fn output_fn);
+int perf_event_poller_multi(int *fds, struct perf_event_mmap_page **headers,
+			    int num_fds, perf_event_print_fn output_fn);
 #endif
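
For readers who want to try the polling pattern outside the selftests tree, here is a stripped-down sketch that assumes plain file descriptors (pipes, for instance) instead of perf ring buffers. All names in it (`enum ev_ret`, `poll_multi`, `drain_one`) are hypothetical and not part of trace_helpers. It uses one enum for every return value, in line with the review comment on v3 about not mixing -1 with LIBBPF_* codes, and it puts the handler result in the loop condition so that a non-CONT result ends the outer loop too.

```c
#include <poll.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical return codes, loosely modelled on LIBBPF_PERF_EVENT_*. */
enum ev_ret { EV_CONT, EV_DONE, EV_ERROR };

typedef enum ev_ret (*ev_handler_fn)(int fd);

/* Poll num_fds descriptors until a handler returns something other
 * than EV_CONT, or poll() itself fails.
 */
static enum ev_ret poll_multi(const int *fds, int num_fds, int timeout_ms,
			      ev_handler_fn handle)
{
	enum ev_ret ret = EV_CONT;
	struct pollfd *pfds;
	int i;

	pfds = calloc(num_fds, sizeof(*pfds));
	if (!pfds)
		return EV_ERROR;

	for (i = 0; i < num_fds; i++) {
		pfds[i].fd = fds[i];
		pfds[i].events = POLLIN;
	}

	while (ret == EV_CONT) {
		/* A timeout returns 0 and simply goes around again. */
		if (poll(pfds, num_fds, timeout_ms) < 0) {
			ret = EV_ERROR;
			break;
		}
		for (i = 0; i < num_fds && ret == EV_CONT; i++) {
			if (pfds[i].revents)
				ret = handle(pfds[i].fd);
		}
	}

	free(pfds);
	return ret;
}

/* Example handler: consume one byte, stop when it is 'q'. */
static enum ev_ret drain_one(int fd)
{
	char c;

	if (read(fd, &c, 1) != 1)
		return EV_ERROR;
	return c == 'q' ? EV_DONE : EV_CONT;
}
```

Calling `poll_multi(fds, n, 1000, drain_one)` on the read ends of a few pipes returns EV_DONE once a 'q' has been read from any of them. The sketch treats an interrupted poll() as an error, which a real tool would probably want to retry instead.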

^ permalink raw reply related

* Re: [PATCH net-next] qed*: Utilize FW 8.37.2.0
From: David Miller @ 2018-06-05 14:48 UTC (permalink / raw)
  To: Michal.Kalderon
  Cc: netdev, linux-rdma, linux-scsi, Ariel.Elior, manish.rangankar
In-Reply-To: <20180605101116.30292-1-Michal.Kalderon@cavium.com>

From: Michal Kalderon <Michal.Kalderon@cavium.com>
Date: Tue, 5 Jun 2018 13:11:16 +0300

> This FW contains several fixes and features.
> 
> RDMA
> - Several modifications and fixes for Memory Windows
> - drop vlan and tcp timestamp from mss calculation in driver for
>   this FW
> - Fix SQ completion flow when local ack timeout is infinite
> - Modifications in t10dif support
> 
> ETH
> - Fix aRFS for tunneled traffic without inner IP.
> - Fix chip configuration which may fail under heavy traffic conditions.
> - Support receiving any-VNI in VXLAN and GENEVE RX classification.
> 
> iSCSI / FcoE
> - Fix iSCSI recovery flow
> - Drop vlan and tcp timestamp from mss calc for fw 8.37.2.0
> 
> Misc
> - Several registers (split registers) won't read correctly with
>   ethtool -d
> 
> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
> Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com>
> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>

Applied, thank you.

^ permalink raw reply

* Re: [PATCH] net-tcp: remove useless tw_timeout field
From: David Miller @ 2018-06-05 14:45 UTC (permalink / raw)
  To: eric.dumazet; +Cc: zenczykowski, maze, edumazet, netdev
In-Reply-To: <ed973e26-eb29-362f-fc86-eb0955b978ad@gmail.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 5 Jun 2018 05:59:27 -0700

> On 06/05/2018 03:07 AM, Maciej Żenczykowski wrote:
>> From: Maciej Żenczykowski <maze@google.com>
>> 
>> Tested: 'git grep tw_timeout' comes up empty and it builds :-)
>> 
>> Signed-off-by: Maciej Żenczykowski <maze@google.com>
>> Cc: Eric Dumazet <edumazet@google.com>
> 
> This field became no longer needed when tcp_tw_recycle was removed in linux-4.12
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks everyone.

^ permalink raw reply

* Re: [PATCH bpf-next v3 1/2] trace_helpers.c: Add helpers to poll multiple perf FDs for events
From: Toke Høiland-Jørgensen @ 2018-06-05 14:44 UTC (permalink / raw)
  To: Daniel Borkmann, netdev
In-Reply-To: <d23ef7fd-27d4-2f9b-b5cd-cb0048514c4a@iogearbox.net>

Daniel Borkmann <daniel@iogearbox.net> writes:

> Hi Toke,
>
> On 06/05/2018 01:14 PM, Toke Høiland-Jørgensen wrote:
>> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
>
> Please no empty commit message. Not sure why from the previous patch
> you removed it here.

Ah, right, sorry; think I got patch versions mixed up :/

Will resend

-Toke

^ permalink raw reply

* Re: [PATCH] netfilter: provide udp*_lib_lookup for nf_tproxy
From: Eckl, Máté @ 2018-06-05 14:42 UTC (permalink / raw)
  To: arnd
  Cc: pablo, davem, kuznet, yoshfuji, pabeni, willemb, edumazet,
	dsahern, kafai, netdev, linux-kernel
In-Reply-To: <20180605114056.1239571-1-arnd@arndb.de>

Arnd Bergmann <arnd@arndb.de> wrote (on Tue, Jun 5, 2018, 13:41):
>
> It is now possible to enable the libified nf_tproxy modules without
> also enabling NETFILTER_XT_TARGET_TPROXY, which throws off the
> ifdef logic in the udp core code:
>
> net/ipv6/netfilter/nf_tproxy_ipv6.o: In function `nf_tproxy_get_sock_v6':
> nf_tproxy_ipv6.c:(.text+0x1a8): undefined reference to `udp6_lib_lookup'
> net/ipv4/netfilter/nf_tproxy_ipv4.o: In function `nf_tproxy_get_sock_v4':
> nf_tproxy_ipv4.c:(.text+0x3d0): undefined reference to `udp4_lib_lookup'
>
> We can actually simplify the conditions now to provide the two functions
> exactly when they are needed.
>
> Fixes: 45ca4e0cf273 ("netfilter: Libify xt_TPROXY")
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Máté Eckl <ecklm94@gmail.com>

^ permalink raw reply

* Re: [PATCH net] net: sched: cls: Fix offloading when ingress dev is vxlan
From: David Miller @ 2018-06-05 14:30 UTC (permalink / raw)
  To: paulb
  Cc: jiri, xiyou.wangcong, jhs, netdev, kliteyn, roid, shahark, markb,
	ogerlitz
In-Reply-To: <1528185843-18645-1-git-send-email-paulb@mellanox.com>

From: Paul Blakey <paulb@mellanox.com>
Date: Tue,  5 Jun 2018 11:04:03 +0300

> When using a vxlan device as the ingress dev, we count it as a
> "no offload dev", so when such a rule comes and err stop is true,
> we fail early and don't try the egdev route which can offload it
> through the egress device.
> 
> Fix that by not calling the block offload if one of the devices
> attached to it is not offload capable, but make sure egress in such a case
> is capable instead.
> 
> Fixes: caa7260156eb ("net: sched: keep track of offloaded filters [..]")
> Reviewed-by: Roi Dayan <roid@mellanox.com>
> Acked-by: Jiri Pirko <jiri@mellanox.com>
> Signed-off-by: Paul Blakey <paulb@mellanox.com>

Applied and queued up for -stable, thanks.

^ permalink raw reply

* Re: [PATCH net-next 3/3] mlxsw: Add extack messages for port_{un,}split failures?
From: David Miller @ 2018-06-05 14:24 UTC (permalink / raw)
  To: jiri; +Cc: idosch, dsahern, netdev, idosch, jiri, jakub.kicinski, dsahern
In-Reply-To: <20180605081836.GD2164@nanopsycho>

From: Jiri Pirko <jiri@resnulli.us>
Date: Tue, 5 Jun 2018 10:18:36 +0200

> Tue, Jun 05, 2018 at 10:05:28AM CEST, idosch@idosch.org wrote:
>>On Tue, Jun 05, 2018 at 09:52:30AM +0200, Jiri Pirko wrote:
>>> Tue, Jun 05, 2018 at 12:15:03AM CEST, dsahern@kernel.org wrote:
>>> > 	if (!mlxsw_sp_port->split) {
>>> > 		netdev_err(mlxsw_sp_port->dev, "Port wasn't split\n");
>>> >+		NL_SET_ERR_MSG_MOD(extack, "Port was not split");
>>> 
>>> I wonder if we need the dmesg for these as well. Plus it is not the same
>>> (wasn't/was not) which is maybe confusing. Any objection against the
>>> original dmesg messages removal?
>>
>>We had this discussion about three months ago and decided to keep the
>>existing messages:
>>https://marc.info/?l=linux-netdev&m=151982813309466&w=2
> 
> I forgot. Thanks for reminding me. So could we at least have the
> messages 100% same? Thanks.

Seems like a reasonable request, David A.?

^ permalink raw reply

* Re: [PATCH net] sctp: not allow transport timeout value less than HZ/5 for hb_timer
From: David Miller @ 2018-06-05 14:23 UTC (permalink / raw)
  To: lucien.xin
  Cc: netdev, linux-sctp, edumazet, marcelo.leitner, nhorman, dvyukov,
	syzkaller
In-Reply-To: <97b99fac474db414ea8486a1fbd3a37dacd4b1b1.1528172218.git.lucien.xin@gmail.com>

From: Xin Long <lucien.xin@gmail.com>
Date: Tue,  5 Jun 2018 12:16:58 +0800

> syzbot reported an rcu_sched self-detected stall on CPU which is caused
> by too small a value set on rto_min with the SCTP_RTOINFO sockopt. With this
> value, hb_timer will get stuck there, as in its timer handler it starts
> this timer again with this value, then goes to the timer handler again.
> 
> This problem has been there since the very beginning, and thanks to Eric for the
> reproducer shared from a syzbot mail.
> 
> This patch fixes it by not allowing sctp_transport_timeout to return a
> smaller value than HZ/5 for hb_timer, which is based on TCP's min rto.
> 
> Note that it doesn't fix this issue by limiting rto_min, as some users
> are still using small rto and no proper value was found for it yet.
> 
> Reported-by: syzbot+3dcd59a1f907245f891f@syzkaller.appspotmail.com
> Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> Signed-off-by: Xin Long <lucien.xin@gmail.com>

Applied and queued up for -stable, thanks Xin.
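
For anyone following along, the idea described in the commit message amounts to a lower bound on the timer value. The sketch below is a userspace illustration only, not the actual change to sctp_transport_timeout(); the function name is hypothetical and `hz` stands in for the kernel's HZ.

```c
/* Hypothetical helper: never return a heartbeat timeout below HZ/5. */
static unsigned long hb_timeout_floor(unsigned long timeout, unsigned long hz)
{
	unsigned long min = hz / 5;	/* 200 ms worth of jiffies */

	return timeout < min ? min : timeout;
}
```

With HZ at 1000, a timeout of 1 jiffy is raised to 200, while a timeout of 500 is left alone, so the timer handler can no longer re-arm itself almost immediately.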

^ permalink raw reply

* Re: [PATCH net-next] bpfilter: switch to CC from HOSTCC
From: David Miller @ 2018-06-05 14:21 UTC (permalink / raw)
  To: ast; +Cc: daniel, netdev, linux-kernel, kernel-team, arnd, yamada.masahiro
In-Reply-To: <20180605025341.3965492-1-ast@kernel.org>

From: Alexei Starovoitov <ast@kernel.org>
Date: Mon, 4 Jun 2018 19:53:41 -0700

> check that CC can build executables and use that compiler instead of HOSTCC
> 
> Suggested-by: Arnd Bergmann <arnd@arndb.de>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Applied, thanks Alexei.

^ permalink raw reply

* Re: [PATCH net-next] net/mlx5e: fix error return code in mlx5e_alloc_rq()
From: David Miller @ 2018-06-05 14:20 UTC (permalink / raw)
  To: weiyongjun1; +Cc: saeedm, leon, tariqt, netdev, linux-rdma, kernel-janitors
In-Reply-To: <1528166576-55047-1-git-send-email-weiyongjun1@huawei.com>

From: Wei Yongjun <weiyongjun1@huawei.com>
Date: Tue, 5 Jun 2018 02:42:56 +0000

> Fix to return error code -ENOMEM from the kvzalloc_node() error handling
> case instead of 0, as done elsewhere in this function.
> 
> Fixes: 069d11465a80 ("net/mlx5e: RX, Enhance legacy Receive Queue memory scheme")
> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>

Applied.
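
The class of bug fixed here is easy to reproduce in miniature. The following userspace sketch is not the mlx5e code: `struct rq`, `rq_alloc` and the `fail_second` knob are hypothetical. It shows why the error variable has to be set before jumping to the cleanup label.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical receive queue with two separately allocated arrays. */
struct rq {
	int *wqe_info;
	int *frags;
};

/* fail_second is a test knob that simulates the second allocation failing. */
static int rq_alloc(struct rq *rq, size_t n, int fail_second)
{
	int err = 0;	/* still 0 after every step that succeeded */

	rq->wqe_info = calloc(n, sizeof(*rq->wqe_info));
	if (!rq->wqe_info)
		return -ENOMEM;

	rq->frags = fail_second ? NULL : calloc(n, sizeof(*rq->frags));
	if (!rq->frags) {
		err = -ENOMEM;	/* without this line the caller would see 0 */
		goto err_free_info;
	}

	return 0;

err_free_info:
	free(rq->wqe_info);
	rq->wqe_info = NULL;
	return err;
}
```

A caller that only checks the return value would otherwise carry on with a half-initialized queue.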

^ permalink raw reply

* Re: [PATCH net-next] net/mlx5e: Make function mlx5e_change_rep_mtu() static
From: David Miller @ 2018-06-05 14:20 UTC (permalink / raw)
  To: weiyongjun1; +Cc: saeedm, leon, adin, netdev, linux-rdma, kernel-janitors
In-Reply-To: <1528166565-54828-1-git-send-email-weiyongjun1@huawei.com>

From: Wei Yongjun <weiyongjun1@huawei.com>
Date: Tue, 5 Jun 2018 02:42:45 +0000

> Fixes the following sparse warning:
> 
> drivers/net/ethernet/mellanox/mlx5/core/en_rep.c:903:5: warning:
>  symbol 'mlx5e_change_rep_mtu' was not declared. Should it be static?
> 
> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next v2] net: qualcomm: rmnet: Fix use after free while sending command ack
From: David Miller @ 2018-06-05 14:17 UTC (permalink / raw)
  To: subashab; +Cc: netdev
In-Reply-To: <1528163018-16113-1-git-send-email-subashab@codeaurora.org>

From: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Date: Mon,  4 Jun 2018 19:43:38 -0600

> When sending an ack to a command packet, the skb is still referenced
> after it is sent to the real device. Since the real device could
> free the skb, the device pointer would be invalid.
> Also, remove an unnecessary variable.
> 
> Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
> Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>

Applied.
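
The pattern behind this kind of fix can be sketched in userspace as follows. This is not the rmnet code; `struct dev`, `struct pkt`, `xmit` and `send_ack` are hypothetical names. The point is only that anything needed after the hand-off has to be read from the packet before it is handed off.

```c
#include <stdlib.h>

/* Hypothetical device and packet types. */
struct dev {
	unsigned long tx_acks;
};

struct pkt {
	struct dev *dev;
};

/* Consumes p: the callee owns the packet and may free it before returning. */
static void xmit(struct pkt *p)
{
	free(p);
}

static void send_ack(struct pkt *p)
{
	struct dev *dev = p->dev;	/* read what is needed first */

	xmit(p);			/* p must not be touched after this */
	dev->tx_acks++;			/* safe: uses the saved pointer */
}
```

Writing `p->dev->tx_acks++` after the call to xmit() would dereference freed memory, which is the shape of the problem the commit message describes.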

^ permalink raw reply

* Re: [PATCH net-next v2] net: ipv6: Generate random IID for addresses on RAWIP devices
From: David Miller @ 2018-06-05 14:17 UTC (permalink / raw)
  To: subashab; +Cc: netdev, yoshfuji, stranche
In-Reply-To: <1528161967-6382-1-git-send-email-subashab@codeaurora.org>

From: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Date: Mon,  4 Jun 2018 19:26:07 -0600

> RAWIP devices such as rmnet do not have a hardware address and
> instead require the kernel to generate a random IID for the
> IPv6 addresses.
> 
> Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
> Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>

Applied.
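
As a rough illustration of what a random IID means for the resulting address, here is a userspace sketch. It makes no claim about how the kernel patch is implemented; both helper names are hypothetical, and getrandom() merely stands in for whatever random source is used.

```c
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/random.h>

/* Hypothetical helper: fill an 8-byte interface identifier with random
 * bytes. Returns 0 on success, -1 if the random source came up short.
 */
static int random_iid(uint8_t iid[8])
{
	return getrandom(iid, 8, 0) == 8 ? 0 : -1;
}

/* Hypothetical helper: combine a /64 prefix with an IID into one address. */
static void addr_from_prefix_iid(struct in6_addr *addr,
				 const struct in6_addr *prefix,
				 const uint8_t iid[8])
{
	memcpy(addr->s6_addr, prefix->s6_addr, 8);	/* upper 64 bits: prefix */
	memcpy(addr->s6_addr + 8, iid, 8);		/* lower 64 bits: IID */
}
```

On a device with a hardware address the lower 64 bits would normally be derived from that address; with no hardware address to draw on, random bytes fill the same slot.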

^ permalink raw reply

* Re: [PATCH] r8169: Reinstate ALDPS and ASPM support
From: David Miller @ 2018-06-05 14:15 UTC (permalink / raw)
  To: andrew
  Cc: kai.heng.feng, hayeswang, hkallweit1, romieu, netdev,
	linux-kernel, ryankao
In-Reply-To: <20180605141114.GC14873@lunn.ch>

From: Andrew Lunn <andrew@lunn.ch>
Date: Tue, 5 Jun 2018 16:11:14 +0200

> No module parameter please. Just turn it on by default. Assuming
> testing shows it works.

Agreed.

^ permalink raw reply

* bpf-next is CLOSED
From: Daniel Borkmann @ 2018-06-05 14:14 UTC (permalink / raw)
  To: netdev; +Cc: ast

Please only submit bug fixes at this time due to merge window, thank you.

^ permalink raw reply

* Re: [PATCH bpf-next v3 1/2] trace_helpers.c: Add helpers to poll multiple perf FDs for events
From: Daniel Borkmann @ 2018-06-05 14:13 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, netdev
In-Reply-To: <152819729342.9696.4421334230852378808.stgit@alrua-kau>

Hi Toke,

On 06/05/2018 01:14 PM, Toke Høiland-Jørgensen wrote:
> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>

Please no empty commit message. Not sure why from the previous patch
you removed it here.

> ---
>  tools/testing/selftests/bpf/trace_helpers.c |   47 ++++++++++++++++++++++++++-
>  tools/testing/selftests/bpf/trace_helpers.h |    4 ++
>  2 files changed, 49 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/testing/selftests/bpf/trace_helpers.c b/tools/testing/selftests/bpf/trace_helpers.c
> index 3868dcb63420..1e62d89f34cf 100644
> --- a/tools/testing/selftests/bpf/trace_helpers.c
> +++ b/tools/testing/selftests/bpf/trace_helpers.c
> @@ -88,7 +88,7 @@ static int page_size;
>  static int page_cnt = 8;
>  static struct perf_event_mmap_page *header;
>  
> -int perf_event_mmap(int fd)
> +int perf_event_mmap_header(int fd, struct perf_event_mmap_page **header)
>  {
>  	void *base;
>  	int mmap_size;
> @@ -102,10 +102,15 @@ int perf_event_mmap(int fd)
>  		return -1;
>  	}
>  
> -	header = base;
> +	*header = base;
>  	return 0;
>  }
>  
> +int perf_event_mmap(int fd)
> +{
> +	return perf_event_mmap_header(fd, &header);
> +}
> +
>  static int perf_event_poll(int fd)
>  {
>  	struct pollfd pfd = { .fd = fd, .events = POLLIN };
> @@ -163,3 +168,41 @@ int perf_event_poller(int fd, perf_event_print_fn output_fn)
>  
>  	return ret;
>  }
> +
> +int perf_event_poller_multi(int *fds, struct perf_event_mmap_page **headers,
> +			    int num_fds, perf_event_print_fn output_fn)
> +{
> +	enum bpf_perf_event_ret ret;
> +	struct pollfd *pfds;
> +	void *buf = NULL;
> +	size_t len = 0;
> +	int i;
> +
> +	pfds = malloc(sizeof(*pfds) * num_fds);
> +	if (!pfds)
> +		return -1;

Also, just noticed here you mix -1 as return code with LIBBPF_* return
codes. Would be better not to overlap such usage.

> +	memset(pfds, 0, sizeof(*pfds) * num_fds);
> +	for (i = 0; i < num_fds; i++) {
> +		pfds[i].fd = fds[i];
> +		pfds[i].events = POLLIN;
> +	}
> +
> +	for (;;) {
> +		poll(pfds, num_fds, 1000);
> +		for (i = 0; i < num_fds; i++) {
> +			if (pfds[i].revents) {
> +				ret = bpf_perf_event_read_simple(headers[i], page_cnt * page_size,
> +								page_size, &buf, &len,
> +								bpf_perf_event_print,
> +								output_fn);
> +				if (ret != LIBBPF_PERF_EVENT_CONT)
> +					break;
> +			}
> +		}
> +	}
> +	free(buf);
> +	free(pfds);
> +
> +	return ret;

Thanks,
Daniel

^ permalink raw reply

* Re: [PATCH] r8169: Reinstate ALDPS and ASPM support
From: Andrew Lunn @ 2018-06-05 14:11 UTC (permalink / raw)
  To: Kai-Heng Feng
  Cc: davem, hayeswang, hkallweit1, romieu, netdev, linux-kernel,
	Ryankao
In-Reply-To: <20180605045812.17977-1-kai.heng.feng@canonical.com>

On Tue, Jun 05, 2018 at 12:58:12PM +0800, Kai-Heng Feng wrote:
> This patch reinstates ALDPS and ASPM support on r8169.
> 
> On some Intel platforms, ASPM support on r8169 is the key factor to let
> Package C-State achieve PC8. Without ASPM support, the deepest Package
> C-State it can hit is PC3. PC8 can save an additional ~3W in comparison with
> PC3.
> 
> This patch is from Realtek.
> 
> Fixes: e0c075577965 ("r8169: enable ALDPS for power saving")
> Fixes: d64ec841517a ("r8169: enable internal ASPM and clock request settings")
> 
> Cc: Ryankao <ryankao@realtek.com>
> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> ---
>  drivers/net/ethernet/realtek/r8169.c | 190 +++++++++++++++++++++------
>  1 file changed, 151 insertions(+), 39 deletions(-)
> 
> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
> index 75dfac0248f4..a28ef20be221 100644
> --- a/drivers/net/ethernet/realtek/r8169.c
> +++ b/drivers/net/ethernet/realtek/r8169.c
> @@ -319,6 +319,8 @@ static const struct pci_device_id rtl8169_pci_tbl[] = {
>  
>  MODULE_DEVICE_TABLE(pci, rtl8169_pci_tbl);
>  
> +static int enable_aspm = 1;
> +static int enable_aldps = 1;
>  static int use_dac = -1;
>  static struct {
>  	u32 msg_enable;
> @@ -817,6 +819,10 @@ struct rtl8169_private {
>  
>  MODULE_AUTHOR("Realtek and the Linux r8169 crew <netdev@vger.kernel.org>");
>  MODULE_DESCRIPTION("RealTek RTL-8169 Gigabit Ethernet driver");
> +module_param(enable_aspm, int, 0);
> +MODULE_PARM_DESC(enable_aspm, "Enable ASPM");
> +module_param(enable_aldps, int, 0);
> +MODULE_PARM_DESC(enable_aldps, "Enable ALDPS");

Hi Kai

No module parameter please. Just turn it on by default. Assuming
testing shows it works.

	Andrew

^ permalink raw reply

* Re: AF_XDP. Was: [net-next 00/12][pull request] Intel Wired LAN Driver Updates 2018-06-04
From: Daniel Borkmann @ 2018-06-05 14:11 UTC (permalink / raw)
  To: Björn Töpel, Alexander Duyck
  Cc: Alexei Starovoitov, David Miller, Björn Töpel,
	Karlsson, Magnus, ast, Daniel Borkmann, Or Gerlitz, Jeff Kirsher,
	Netdev
In-Reply-To: <CAJ+HfNhBZvARUwrr838Dc6eZZZ0LjkWjaGtAHuO-5UjHXwfSMQ@mail.gmail.com>

On 06/05/2018 10:44 AM, Björn Töpel wrote:
> On Tue, Jun 5, 2018 at 03:46, Alexander Duyck <alexander.duyck@gmail.com> wrote:
>> On Mon, Jun 4, 2018 at 4:32 PM, Alexei Starovoitov
>> <alexei.starovoitov@gmail.com> wrote:
>>> On Mon, Jun 04, 2018 at 03:02:31PM -0700, Alexander Duyck wrote:
>>>> On Mon, Jun 4, 2018 at 2:27 PM, David Miller <davem@davemloft.net> wrote:
>>>>> From: Or Gerlitz <gerlitz.or@gmail.com>
>>>>> Date: Tue, 5 Jun 2018 00:11:35 +0300
>>>>>
>>>>>> Just to make sure, is the AF_XDP ZC (Zero Copy) UAPI going to be
>>>>>> merged for this window -- AFAIU from [1], it's still under
>>>>>> examination/development/research for non Intel HWs, am I correct or
>>>>>> this is going to get in now?
>>>>>
>>>>> All of the pending AF_XDP changes will be merged this merge window.
>>>>>
>>>>> I think Intel folks need to review things as fast as possible because
>>>>> I pretty much refuse to revert the series or disable it in Kconfig at
>>>>> this point.
>>>>>
>>>>> Thank you.
>>>>
>>>> My understanding of things is that the current AF_XDP patches were
>>>> going to be updated to have more of a model agnostic API such that
>>>> they would work for either the "typewriter" mode or the descriptor
>>>> ring based approach. The current plan was to have the zero copy
>>>> patches be a follow-on after the vendor agnostic API bits in the
>>>> descriptors and such had been sorted out. I believe you guys have the
>>>> descriptor fixes already right?
>>>>
>>>> In my opinion the i40e code isn't mature enough yet to really go into
>>>> anything other than maybe net-next in a couple weeks. We are going to
>>>> need a while to get adequate testing in order to flush out all the
>>>> bugs and performance regressions we are likely to see coming out of
>>>> this change.
>>>
>>> I think the work everyone did in this release cycle increased my confidence
>>> that the way descriptors are defined and the rest of uapi are stable enough
>>> and i40e zero copy bits can land in the next release without uapi changes.
>>> In that sense even if we merge i40e parts now, the other nic vendors
>>> will be in the same situation and may find things that they would like
>>> to improve in uapi.
>>> So I propose we merge the first 7 patches of the last series now and
>>> let 3 remaining i40e patches go via intel trees for the next release.
>>> In the mean time other NIC vendors should start actively working
>>> on AF_XDP support as well.
>>> If somehow uapi would need tweaks, we can still do minor adjustments
>>> since 4.18 won't be released for ~10 weeks.
>>
>> That works for me. Actually I think patch 11 can probably be included
>> as well since that is just sample code and could probably be used by
>> whatever drivers end up implementing this.
> 
> The approach suggested by Alexei and Alex sounds good to us. Alex's
> review items are very much valid, and require more time to address.
> Therefore addressing i40e in the next merge windows sounds like a
> great idea.
> 
> As Alex suggests, including patch 11 together with the first seven makes sense.

Ok with it as well, and I've pushed just that, thanks everyone!

^ permalink raw reply

* Re: [PATCH net-next] tcp: refactor tcp_ecn_check_ce to remove sk type cast
From: David Miller @ 2018-06-05 14:10 UTC (permalink / raw)
  To: ysseung; +Cc: netdev, ncardwell, ycheng, edumazet
In-Reply-To: <20180604222951.229735-1-ysseung@google.com>

From: Yousuk Seung <ysseung@google.com>
Date: Mon,  4 Jun 2018 15:29:51 -0700

> Refactor tcp_ecn_check_ce and __tcp_ecn_check_ce to accept struct sock*
> instead of tcp_sock* to clean up type casts. This is a pure refactor
> patch.
> 
> Signed-off-by: Yousuk Seung <ysseung@google.com>
> Signed-off-by: Neal Cardwell <ncardwell@google.com>
> Signed-off-by: Yuchung Cheng <ycheng@google.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Acked-by: Soheil Hassas Yeganeh <soheil@google.com>

Applied.
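
For readers unfamiliar with this kind of refactor, here is a minimal userspace sketch of the idea described in the commit message: the helper takes the generic socket pointer and derives the TCP-specific view in one place, so callers no longer cast. It is not the applied patch. The struct layouts, the flag names `ECN_OK` and `ECN_DEMAND_CWR`, and the helpers `to_tcp_sock()` and `ecn_check_ce()` are simplified, hypothetical stand-ins.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct sock {
	int sk_state;
};

struct tcp_sock {
	struct sock sk;		/* generic socket embedded in the TCP socket */
	unsigned int ecn_flags;
};

#define ECN_OK		0x1	/* assumed flag: ECN was negotiated */
#define ECN_DEMAND_CWR	0x2	/* assumed flag: ask the peer to reduce cwnd */

/* The only place that converts the generic pointer to the TCP view. */
static struct tcp_sock *to_tcp_sock(struct sock *sk)
{
	return (struct tcp_sock *)((char *)sk - offsetof(struct tcp_sock, sk));
}

/* Accepts struct sock *, so callers pass the pointer they already hold. */
static void ecn_check_ce(struct sock *sk, int ce_marked)
{
	struct tcp_sock *tp = to_tcp_sock(sk);

	if ((tp->ecn_flags & ECN_OK) && ce_marked)
		tp->ecn_flags |= ECN_DEMAND_CWR;
}
```

A caller holding a `struct tcp_sock tp` would simply call `ecn_check_ce(&tp.sk, 1)`; behaviour is unchanged, which matches the "pure refactor" claim above.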

^ permalink raw reply

* Re: [PATCH 09/10] dpaa_eth: add support for hardware timestamping
From: Richard Cochran @ 2018-06-05 13:57 UTC (permalink / raw)
  To: Y.b. Lu
  Cc: netdev@vger.kernel.org, Madalin-cristian Bucur, Rob Herring,
	Shawn Guo, David S . Miller, devicetree@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <DB6PR0401MB2536376432EC481473A5B4A3F8660@DB6PR0401MB2536.eurprd04.prod.outlook.com>

On Tue, Jun 05, 2018 at 03:35:28AM +0000, Y.b. Lu wrote:
> [Y.b. Lu] Actually these timestamping codes affected DPAA networking performance in our previous performance test.
> That's why we used ifdef for it.

How much does time stamping hurt performance?

If the time stamping is compiled in but not enabled at run time, does
it still affect performance?

Thanks,
Richard
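
To make the compile-time versus run-time distinction in the question concrete, here is a minimal sketch of run-time gating, assuming a per-device flag that is checked on the receive path. It does not answer the performance question and is not the dpaa_eth code. `struct priv`, `struct pkt`, and `rx_one()` are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device private state; not the dpaa_eth layout. */
struct priv {
	bool rx_tstamp_enabled;	/* assumed to be toggled at run time */
};

/* Hypothetical packet descriptor. */
struct pkt {
	uint64_t hw_ns;		/* raw timestamp as delivered by hardware */
	uint64_t tstamp_ns;	/* filled in only when stamping is enabled */
};

/* Receive path: when stamping is off, the only cost is this one branch. */
static void rx_one(const struct priv *p, struct pkt *pkt)
{
	if (p->rx_tstamp_enabled)
		pkt->tstamp_ns = pkt->hw_ns;
}
```

With an `#ifdef`, the branch disappears from the binary entirely; with the flag, the code is always built and the cost when disabled is whatever this check costs on the hardware in question, which is what would need measuring.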

^ permalink raw reply

* Re: [PATCH net] net/ipv6: prevent use after free in ip6_route_mpath_notify
From: David Miller @ 2018-06-05 13:57 UTC (permalink / raw)
  To: dsahern; +Cc: netdev, dsahern, edumazet
In-Reply-To: <20180604204142.8941-1-dsahern@kernel.org>

From: dsahern@kernel.org
Date: Mon,  4 Jun 2018 13:41:42 -0700

> From: David Ahern <dsahern@gmail.com>
> 
> syzbot reported a use-after-free:
 ...
> The problem is that rt_last can point to a deleted route if the insert
> fails.
> 
> One reproducer is to insert a route and then add a multipath route that
> has a duplicate nexthop, e.g.:
>     $ ip -6 ro add vrf red 2001:db8:101::/64 nexthop via 2001:db8:1::2
>     $ ip -6 ro append vrf red 2001:db8:101::/64 nexthop via 2001:db8:1::4 nexthop via 2001:db8:1::2
> 
> Fix by not setting rt_last until it is verified that the insert succeeded.
> 
> Fixes: 3b1137fe7482 ("net: ipv6: Change notifications for multipath add to RTA_MULTIPATH")
> Cc: Eric Dumazet <edumazet@google.com>
> Reported-by: syzbot <syzkaller@googlegroups.com>
> Signed-off-by: David Ahern <dsahern@gmail.com>

Applied and queued up for -stable, thanks David.
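
The pattern behind the fix can be sketched in userspace C as follows. This is an illustration under stated assumptions, not the kernel patch. `struct route`, `insert_route()`, and `add_multipath()` are hypothetical, and the sketch assumes that a failed insert frees the route, which is what makes a stale "last" pointer dangerous.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a route entry; not the kernel's structure. */
struct route {
	int nexthop_id;
};

#define MAX_ROUTES 8
static struct route *table[MAX_ROUTES];
static int table_len;

/* Hypothetical insert: rejects duplicate nexthops and frees the rejected
 * route, so the caller must not touch 'rt' after a failure. */
static bool insert_route(struct route *rt)
{
	for (int i = 0; i < table_len; i++) {
		if (table[i]->nexthop_id == rt->nexthop_id) {
			free(rt);
			return false;
		}
	}
	if (table_len == MAX_ROUTES) {
		free(rt);
		return false;
	}
	table[table_len++] = rt;
	return true;
}

/* Insert one route per nexthop id and return the last route that was
 * actually inserted (NULL if none). rt_last is assigned only after
 * insert_route() has succeeded, so it never points at freed memory. */
static struct route *add_multipath(const int *ids, int n)
{
	struct route *rt_last = NULL;

	for (int i = 0; i < n; i++) {
		struct route *rt = malloc(sizeof(*rt));

		if (!rt)
			break;
		rt->nexthop_id = ids[i];
		if (!insert_route(rt))
			break;		/* rt is gone; rt_last is still valid */
		rt_last = rt;
	}
	return rt_last;
}
```

Mirroring the reproducer, adding nexthop 2 and then appending nexthops 4 and 2 stops at the duplicate, and the returned pointer refers to the route for nexthop 4, which is still in the table.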

^ permalink raw reply

* Re: INFO: task hung in ip6gre_exit_batch_net
From: Kirill Tkhai @ 2018-06-05 13:55 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: syzbot, Christian Brauner, David Miller, David Ahern,
	Florian Westphal, Jiri Benc, LKML, Xin Long, mschiffer, netdev,
	syzkaller-bugs, Vladislav Yasevich
In-Reply-To: <CACT4Y+bPrdK7xSeUk=pgVVyaseDZUAvArBA4bEWPsUi7CE3EVw@mail.gmail.com>

On 05.06.2018 12:36, Dmitry Vyukov wrote:
> On Tue, Jun 5, 2018 at 11:03 AM, Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
>> Hi, Dmitry!
>>
>> On 04.06.2018 18:22, Dmitry Vyukov wrote:
>>> On Mon, Jun 4, 2018 at 5:03 PM, syzbot
>>> <syzbot+bf78a74f82c1cf19069e@syzkaller.appspotmail.com> wrote:
>>>> Hello,
>>>>
>>>> syzbot found the following crash on:
>>>>
>>>> HEAD commit:    bc2dbc5420e8 Merge branch 'akpm' (patches from Andrew)
>>>> git tree:       upstream
>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=164e42b7800000
>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=982e2df1b9e60b02
>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=bf78a74f82c1cf19069e
>>>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>>>>
>>>> Unfortunately, I don't have any reproducer for this crash yet.
>>>>
>>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>>> Reported-by: syzbot+bf78a74f82c1cf19069e@syzkaller.appspotmail.com
>>>
>>> Another hang on rtnl lock:
>>>
>>> #syz dup: INFO: task hung in netdev_run_todo
>>>
>>> May be related to "unregister_netdevice: waiting for DEV to become free":
>>> https://syzkaller.appspot.com/bug?id=1a97a5bd119fd97995f752819fd87840ab9479a9
> 
> netdev_wait_allrefs does not hold rtnl lock during waiting, so it must
> be something different.
> 
> 
>>> Any other explanations for massive hangs on rtnl lock for minutes?
>>
>> To exclude the situation where a task exits with rtnl_mutex held:
>>
>> would the pr_warn() from print_held_locks_bug() be included in the console output
>> if it appears?
> 
> Yes, everything containing "WARNING:" is detected as bug.

OK, then a dead task that never released the lock is excluded.

One more assumption: someone corrupted memory around rtnl_mutex, so it only looks locked.
(I read the lockdep "(rtnl_mutex){+.+.}" prints in the initial message as "nobody owns rtnl_mutex".)
A crash dump of the VM may help here.

There may also be a bug in the locking code, but that seems the least probable to me.

Kirill

^ permalink raw reply

* Re: [PATCH net-next] net: phy: broadcom: Enable 125 MHz clock on LED4 pin for BCM54612E by default.
From: David Miller @ 2018-06-05 13:43 UTC (permalink / raw)
  To: kunyi
  Cc: netdev, Avi.Fishman, tali.perry, tomer.maimon, benjaminfair,
	rlippert, f.fainelli
In-Reply-To: <20180604201704.238472-1-kunyi@google.com>

From: Kun Yi <kunyi@google.com>
Date: Mon,  4 Jun 2018 13:17:04 -0700

> BCM54612E has 4 multi-functional LED pins that can be configured
> through register settings; the LED4 pin can be configured to a 125MHz
> reference clock output by setting the spare register. Since the dedicated
> CLK125 reference clock pin is not brought out on the 48-Pin MLP, the LED4
> pin is the only pin to provide such function in this package, and therefore
> it is beneficial to just enable the reference clock by default.
> 
> Signed-off-by: Kun Yi <kunyi@google.com>

Applied, thank you.

^ permalink raw reply
