Netdev List
* Re: [PATCH v5 2/3] media: rc: introduce BPF_PROG_LIRC_MODE2
From: Sean Young @ 2018-06-05 10:16 UTC (permalink / raw)
  To: Matthias Reichl, linux-media, linux-kernel, Alexei Starovoitov,
	Mauro Carvalho Chehab, Daniel Borkmann, netdev, Devin Heitmueller,
	Y Song, Quentin Monnet
In-Reply-To: <20180604174730.sctfoklq7klswebp@camel2.lan>

On Mon, Jun 04, 2018 at 07:47:30PM +0200, Matthias Reichl wrote:
> Hi Sean,
> 
> I finally found the time to test your patch series and noticed
> 2 issues - comments are inline
> 
> On Sun, May 27, 2018 at 12:24:09PM +0100, Sean Young wrote:
> > diff --git a/drivers/media/rc/Kconfig b/drivers/media/rc/Kconfig
> > index eb2c3b6eca7f..d5b35a6ba899 100644
> > --- a/drivers/media/rc/Kconfig
> > +++ b/drivers/media/rc/Kconfig
> > @@ -25,6 +25,19 @@ config LIRC
> >  	   passes raw IR to and from userspace, which is needed for
> >  	   IR transmitting (aka "blasting") and for the lirc daemon.
> >  
> > +config BPF_LIRC_MODE2
> > +	bool "Support for eBPF programs attached to lirc devices"
> > +	depends on BPF_SYSCALL
> > +	depends on RC_CORE=y
> 
> Requiring rc-core to be built into the kernel could become
> problematic in the future for people using media_build.
> 
> Currently the whole media tree (including rc-core) can be built
> as modules so DVB and IR drivers can be replaced by newer versions.
> But with rc-core in the kernel things could easily break if internal
> data structures are changed.
> 
> Maybe we should add a small layer with a stable API/ABI between
> bpf-lirc and rc-core to decouple them? Or would it be possible
> to build rc-core with bpf support as a module?

Unfortunately bpf cannot be built as a module.

> > +	depends on LIRC
> > +	help
> > +	   Allow attaching eBPF programs to a lirc device using the bpf(2)
> > +	   syscall command BPF_PROG_ATTACH. This is supported for raw IR
> > +	   receivers.
> > +
> > +	   These eBPF programs can be used to decode IR into scancodes, for
> > +	   IR protocols not supported by the kernel decoders.
> > +
> >  menuconfig RC_DECODERS
> >  	bool "Remote controller decoders"
> >  	depends on RC_CORE
> > [...]
> > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > index 388d4feda348..3c104113d040 100644
> > --- a/kernel/bpf/syscall.c
> > +++ b/kernel/bpf/syscall.c
> > @@ -11,6 +11,7 @@
> >   */
> >  #include <linux/bpf.h>
> >  #include <linux/bpf_trace.h>
> > +#include <linux/bpf_lirc.h>
> >  #include <linux/btf.h>
> >  #include <linux/syscalls.h>
> >  #include <linux/slab.h>
> > @@ -1578,6 +1579,8 @@ static int bpf_prog_attach(const union bpf_attr *attr)
> >  	case BPF_SK_SKB_STREAM_PARSER:
> >  	case BPF_SK_SKB_STREAM_VERDICT:
> >  		return sockmap_get_from_fd(attr, BPF_PROG_TYPE_SK_SKB, true);
> > +	case BPF_LIRC_MODE2:
> > +		return lirc_prog_attach(attr);
> >  	default:
> >  		return -EINVAL;
> >  	}
> > @@ -1648,6 +1651,8 @@ static int bpf_prog_detach(const union bpf_attr *attr)
> >  	case BPF_SK_SKB_STREAM_PARSER:
> >  	case BPF_SK_SKB_STREAM_VERDICT:
> >  		return sockmap_get_from_fd(attr, BPF_PROG_TYPE_SK_SKB, false);
> > +	case BPF_LIRC_MODE2:
> > +		return lirc_prog_detach(attr);
> >  	default:
> >  		return -EINVAL;
> >  	}
> > @@ -1695,6 +1700,8 @@ static int bpf_prog_query(const union bpf_attr *attr,
> >  	case BPF_CGROUP_SOCK_OPS:
> >  	case BPF_CGROUP_DEVICE:
> >  		break;
> > +	case BPF_LIRC_MODE2:
> > +		return lirc_prog_query(attr, uattr);
> 
> When testing this patch series I was wondering why I always got
> -EINVAL when trying to query the registered programs.
> 
> Closer inspection revealed that bpf_prog_attach/detach/query and
> calls to them in the bpf syscall are in "#ifdef CONFIG_CGROUP_BPF"
> blocks - and as I built the kernel without CONFIG_CGROUP_BPF
> BPF_PROG_ATTACH/DETACH/QUERY weren't handled in the syscall switch
> and I got -EINVAL from the bpf syscall function.
> 
> I haven't checked in detail yet, but it looks to me like
> bpf_prog_attach/detach/query could always be built (or when
> either cgroup bpf or lirc bpf are enabled) and the #ifdefs moved
> inside the switch(). So lirc bpf could be used without cgroup bpf.
> Or am I missing something?

You are right, this feature depends on CONFIG_CGROUP_BPF right now. This
also affects the BPF_SK_MSG_VERDICT, BPF_SK_SKB_STREAM_VERDICT and
BPF_SK_SKB_STREAM_PARSER type bpf attachments, and as far as I know
these shouldn't depend on CONFIG_CGROUP_BPF either.
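
A minimal userspace sketch of the restructuring suggested above, not the
actual kernel change: the dispatch function is always compiled and each
subsystem's cases are guarded individually. The SKETCH_* config macros, the
enum values and the stub handlers are hypothetical stand-ins, not kernel
symbols.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical build config: lirc BPF enabled, cgroup BPF disabled. */
#define SKETCH_CONFIG_BPF_LIRC_MODE2 1
/* #define SKETCH_CONFIG_CGROUP_BPF 1 */

/* Stand-ins for the kernel's attach types; the values are illustrative. */
enum sketch_attach_type {
	SKETCH_CGROUP_DEVICE,
	SKETCH_LIRC_MODE2,
};

#ifdef SKETCH_CONFIG_CGROUP_BPF
static int sketch_cgroup_prog_query(void) { return 0; }
#endif

#ifdef SKETCH_CONFIG_BPF_LIRC_MODE2
static int sketch_lirc_prog_query(void) { return 0; }
#endif

/* Always built; only the individual cases depend on the config options. */
static int sketch_prog_query(enum sketch_attach_type type)
{
	switch (type) {
#ifdef SKETCH_CONFIG_CGROUP_BPF
	case SKETCH_CGROUP_DEVICE:
		return sketch_cgroup_prog_query();
#endif
#ifdef SKETCH_CONFIG_BPF_LIRC_MODE2
	case SKETCH_LIRC_MODE2:
		return sketch_lirc_prog_query();
#endif
	default:
		/* Attach types whose subsystem is compiled out land here. */
		return -EINVAL;
	}
}
```

With this layout a kernel built without cgroup BPF would still reach the lirc
case, and only the cgroup attach types would return -EINVAL.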


Sean


* Re: [PATCH net] ipmr: fix error path when mr_table_alloc fails
From: Sabrina Dubroca @ 2018-06-05 10:17 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, edumazet, nikolay, yuvalm, ivecera
In-Reply-To: <20180604.172514.1940145189918116408.davem@davemloft.net>

2018-06-04, 17:25:14 -0400, David Miller wrote:
> From: Sabrina Dubroca <sd@queasysnail.net>
> Date: Mon,  4 Jun 2018 13:55:54 +0200
> 
> > commit 0bbbf0e7d0e7 ("ipmr, ip6mr: Unite creation of new mr_table")
> > refactored ipmr_new_table, so that it now returns NULL when
> > mr_table_alloc fails. Unfortunately, all callers of ipmr_new_table
> > expect an ERR_PTR. commit 66fb33254f45 ("ipmr: properly check
> > rhltable_init() return value") followed suit.
> > 
> > This can result in NULL deref, when ipmr_rules_exit calls
> > ipmr_free_table with NULL net->ipv4.mrt in the
> > !CONFIG_IP_MROUTE_MULTIPLE_TABLES version.
> > 
> > This patch makes mr_table_alloc return errors, and changes
> > ip6mr_new_table and its callers to return/expect error pointers as
> > well. It also removes the version of mr_table_alloc defined under
> > !CONFIG_IP_MROUTE_COMMON, since it is never used.
> > 
> > Fixes: 0bbbf0e7d0e7 ("ipmr, ip6mr: Unite creation of new mr_table")
> > Fixes: 66fb33254f45 ("ipmr: properly check rhltable_init() return value")
> > Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
> 
> This adds a new warning with gcc-8.1.1 on Fedora 28
> 
>   CC [M]  net/ipv6/ip6mr.o
> In file included from ./arch/x86/include/asm/current.h:5,
>                  from ./include/linux/sched.h:12,
>                  from ./include/linux/uaccess.h:5,
>                  from net/ipv6/ip6mr.c:19:
> net/ipv6/ip6mr.c: In function ‘ip6_mroute_setsockopt’:
> ./include/linux/compiler.h:177:26: warning: ‘mrt’ may be used uninitialized in this function [-Wmaybe-uninitialized]
>   case 8: *(__u64 *)res = *(volatile __u64 *)p; break;  \
>                           ^
> net/ipv6/ip6mr.c:1752:20: note: ‘mrt’ was declared here
>    struct mr_table *mrt;
>                     ^~~

grmbl, CONFIG_UBSAN disables -Wmaybe-uninitialized. I'll prepare a v2,
sorry.
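
For reference, a userspace sketch of why a NULL return goes unnoticed by
callers written against the ERR_PTR convention. The sketch_* helpers are
simplified re-implementations for illustration, assumed to behave like the
kernel's ERR_PTR/IS_ERR/PTR_ERR; the table type and allocators are
hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *sketch_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True only for pointers in the top SKETCH_MAX_ERRNO bytes of the address
 * space; NULL is not in that range. */
static inline bool sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

struct sketch_table { int id; };

static struct sketch_table sketch_storage = { .id = 1 };

/* Failure reported as NULL: the convention the callers do not expect. */
static struct sketch_table *sketch_new_table_null(bool fail)
{
	return fail ? NULL : &sketch_storage;
}

/* Failure reported as an error pointer. */
static struct sketch_table *sketch_new_table_errptr(bool fail)
{
	return fail ? sketch_err_ptr(-ENOMEM) : &sketch_storage;
}

/* Caller that only checks for error pointers. */
static int sketch_caller(struct sketch_table *t)
{
	if (sketch_is_err(t))
		return (int)sketch_ptr_err(t);
	return 0;	/* a real caller would go on to dereference t here */
}
```

sketch_caller() reports -ENOMEM for the error-pointer variant but returns 0
for the NULL variant, which is the path that later ends in a NULL dereference.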

-- 
Sabrina


* Re: [PATCH net] sctp: not allow transport timeout value less than HZ/5 for hb_timer
From: Neil Horman @ 2018-06-05 10:27 UTC (permalink / raw)
  To: Xin Long
  Cc: network dev, linux-sctp, davem, Eric Dumazet,
	Marcelo Ricardo Leitner, Dmitry Vyukov, syzkaller
In-Reply-To: <97b99fac474db414ea8486a1fbd3a37dacd4b1b1.1528172218.git.lucien.xin@gmail.com>

On Tue, Jun 05, 2018 at 12:16:58PM +0800, Xin Long wrote:
> syzbot reported an rcu_sched self-detected stall on CPU which is caused
> by too small a value set on rto_min with the SCTP_RTOINFO sockopt. With
> this value, hb_timer gets stuck: its timer handler starts the timer again
> with the same value, so the handler runs again almost immediately.
> 
> This problem has been there since the very beginning; thanks to Eric for
> the reproducer shared from a syzbot mail.
> 
> This patch fixes it by not allowing sctp_transport_timeout to return a
> smaller value than HZ/5 for hb_timer, which is based on TCP's min rto.
> 
> Note that it doesn't fix this issue by limiting rto_min, as some users
> are still using small rto and no proper value was found for it yet.
> 
> Reported-by: syzbot+3dcd59a1f907245f891f@syzkaller.appspotmail.com
> Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> Signed-off-by: Xin Long <lucien.xin@gmail.com>
> ---
>  net/sctp/transport.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/sctp/transport.c b/net/sctp/transport.c
> index 47f82bd..03fc2c4 100644
> --- a/net/sctp/transport.c
> +++ b/net/sctp/transport.c
> @@ -634,7 +634,7 @@ unsigned long sctp_transport_timeout(struct sctp_transport *trans)
>  	    trans->state != SCTP_PF)
>  		timeout += trans->hbinterval;
>  
> -	return timeout;
> +	return max_t(unsigned long, timeout, HZ / 5);
>  }
>  
>  /* Reset transport variables to their initial values */
> -- 
> 2.1.0
> 
> 
Acked-by: Neil Horman <nhorman@tuxdriver.com>
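
A small userspace sketch of the clamp, to show the effect on how often a
self-rearming timer can fire. SKETCH_HZ is an assumed tick rate (the kernel's
HZ is a build-time option) and both helper names are hypothetical.

```c
#include <assert.h>

/* Assumed tick rate for this sketch; not taken from the patch. */
#define SKETCH_HZ 1000UL

/* Same effect as max_t(unsigned long, timeout, HZ / 5) in the patch. */
static unsigned long sketch_clamp_hb_timeout(unsigned long timeout)
{
	unsigned long lowest = SKETCH_HZ / 5;

	return timeout > lowest ? timeout : lowest;
}

/* Upper bound on how often a timer rearmed with 'timeout' jiffies can fire
 * per second. */
static unsigned long sketch_max_fires_per_sec(unsigned long timeout)
{
	assert(timeout > 0);
	return SKETCH_HZ / timeout;
}
```

Under these assumptions a 1-jiffy timeout could rearm up to 1000 times per
second, while the clamped value limits it to 5.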


* [RFC PATCH] kcm: hold rx mux lock when updating the receive queue.
From: Paolo Abeni @ 2018-06-05 10:32 UTC (permalink / raw)
  To: netdev; +Cc: David S. Miller, Tom Herbert, Kirill Tkhai

Currently kcm holds both the RX mux lock and the socket lock when
updating the sk receive queue, except in some notable cases:

- kcm_rfree holds only the RX mux lock
- kcm_recvmsg holds only the socket lock

As a result, there are possible races which cause receive queue
corruption, as reported by syzbot.

Since we can't acquire the socket lock in kcm_rfree, let's use
the RX mux lock to protect the receive queue update in kcm_recvmsg,
too. Also, let's add a comment noting which locking scheme is in use.

Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
Reported-and-tested-by: syzbot+278279efdd2730dd14bf@syzkaller.appspotmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
This is an RFC, since I'm really new to this area; anyway, syzbot
reported success in testing the proposed fix.
This is very likely a scenario where the hopefully upcoming
skb->prev,next->list_head conversion would have helped a lot, thanks to
list poisoning and list debug.
---
 net/kcm/kcmsock.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
index d3601d421571..95e1d95ab24a 100644
--- a/net/kcm/kcmsock.c
+++ b/net/kcm/kcmsock.c
@@ -188,6 +188,7 @@ static void kcm_rfree(struct sk_buff *skb)
 	}
 }
 
+/* RX mux lock held */
 static int kcm_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 {
 	struct sk_buff_head *list = &sk->sk_receive_queue;
@@ -1157,7 +1158,9 @@ static int kcm_recvmsg(struct socket *sock, struct msghdr *msg,
 			/* Finished with message */
 			msg->msg_flags |= MSG_EOR;
 			KCM_STATS_INCR(kcm->stats.rx_msgs);
+			spin_lock_bh(&kcm->mux->rx_lock);
 			skb_unlink(skb, &sk->sk_receive_queue);
+			spin_unlock_bh(&kcm->mux->rx_lock);
 			kfree_skb(skb);
 		}
 	}
-- 
2.17.1
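
The locking rule can be illustrated with a userspace analogy, assuming a
single lock that every list update must take. This is only a sketch: the
atomic_flag spinlock stands in for the RX mux lock, the node type stands in
for sk_buff, and all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct sketch_node {
	struct sketch_node *prev, *next;
};

struct sketch_queue {
	atomic_flag lock;		/* stands in for mux->rx_lock */
	struct sketch_node head;	/* circular list head */
	unsigned int qlen;
};

static void sketch_queue_init(struct sketch_queue *q)
{
	atomic_flag_clear(&q->lock);
	q->head.prev = q->head.next = &q->head;
	q->qlen = 0;
}

static void sketch_lock(struct sketch_queue *q)
{
	while (atomic_flag_test_and_set_explicit(&q->lock,
						 memory_order_acquire))
		;	/* spin until the holder releases the lock */
}

static void sketch_unlock(struct sketch_queue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Producer side: append under the lock. */
static void sketch_enqueue_tail(struct sketch_queue *q, struct sketch_node *n)
{
	sketch_lock(q);
	n->prev = q->head.prev;
	n->next = &q->head;
	q->head.prev->next = n;
	q->head.prev = n;
	q->qlen++;
	sketch_unlock(q);
}

/* Consumer side: unlink under the same lock, so the pointer updates cannot
 * interleave with a concurrent enqueue or unlink. */
static void sketch_unlink(struct sketch_queue *q, struct sketch_node *n)
{
	sketch_lock(q);
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = NULL;
	q->qlen--;
	sketch_unlock(q);
}
```

The point of the sketch is that both paths take the same lock; a path holding
a different lock would not exclude the other one.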


* Re: [PATCH v5 2/3] media: rc: introduce BPF_PROG_LIRC_MODE2
From: Daniel Borkmann @ 2018-06-05 10:33 UTC (permalink / raw)
  To: Sean Young, Matthias Reichl, linux-media, linux-kernel,
	Alexei Starovoitov, Mauro Carvalho Chehab, netdev,
	Devin Heitmueller, Y Song, Quentin Monnet
In-Reply-To: <20180605101629.yffyp64o7adg6hu5@gofer.mess.org>

On 06/05/2018 12:16 PM, Sean Young wrote:
> On Mon, Jun 04, 2018 at 07:47:30PM +0200, Matthias Reichl wrote:
[...]
>>> @@ -1695,6 +1700,8 @@ static int bpf_prog_query(const union bpf_attr *attr,
>>>  	case BPF_CGROUP_SOCK_OPS:
>>>  	case BPF_CGROUP_DEVICE:
>>>  		break;
>>> +	case BPF_LIRC_MODE2:
>>> +		return lirc_prog_query(attr, uattr);
>>
>> When testing this patch series I was wondering why I always got
>> -EINVAL when trying to query the registered programs.
>>
>> Closer inspection revealed that bpf_prog_attach/detach/query and
>> calls to them in the bpf syscall are in "#ifdef CONFIG_CGROUP_BPF"
>> blocks - and as I built the kernel without CONFIG_CGROUP_BPF
>> BPF_PROG_ATTACH/DETACH/QUERY weren't handled in the syscall switch
>> and I got -EINVAL from the bpf syscall function.
>>
>> I haven't checked in detail yet, but it looks to me like
>> bpf_prog_attach/detach/query could always be built (or when
>> either cgroup bpf or lirc bpf are enabled) and the #ifdefs moved
>> inside the switch(). So lirc bpf could be used without cgroup bpf.
>> Or am I missing something?
> 
> You are right, this feature depends on CONFIG_CGROUP_BPF right now. This
> also affects the BPF_SK_MSG_VERDICT, BPF_SK_SKB_STREAM_VERDICT and
> BPF_SK_SKB_STREAM_PARSER type bpf attachments, and as far as I know
> these shouldn't depend on CONFIG_CGROUP_BPF either.

The latter three do depend on it from a bigger picture as sockmap progs
are orchestrated via cgroups. But I'd be fine if you decouple lirc from
it since there's really no dependency at all, and presumably there are
valid cases where you would want to run it on low-end devices with minimal
kernel.

Thanks,
Daniel


* [PATCH bpf-next v3 1/2] trace_helpers.c: Add helpers to poll multiple perf FDs for events
From: Toke Høiland-Jørgensen @ 2018-06-05 11:14 UTC (permalink / raw)
  To: netdev

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
---
 tools/testing/selftests/bpf/trace_helpers.c |   47 ++++++++++++++++++++++++++-
 tools/testing/selftests/bpf/trace_helpers.h |    4 ++
 2 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/trace_helpers.c b/tools/testing/selftests/bpf/trace_helpers.c
index 3868dcb63420..1e62d89f34cf 100644
--- a/tools/testing/selftests/bpf/trace_helpers.c
+++ b/tools/testing/selftests/bpf/trace_helpers.c
@@ -88,7 +88,7 @@ static int page_size;
 static int page_cnt = 8;
 static struct perf_event_mmap_page *header;
 
-int perf_event_mmap(int fd)
+int perf_event_mmap_header(int fd, struct perf_event_mmap_page **header)
 {
 	void *base;
 	int mmap_size;
@@ -102,10 +102,15 @@ int perf_event_mmap(int fd)
 		return -1;
 	}
 
-	header = base;
+	*header = base;
 	return 0;
 }
 
+int perf_event_mmap(int fd)
+{
+	return perf_event_mmap_header(fd, &header);
+}
+
 static int perf_event_poll(int fd)
 {
 	struct pollfd pfd = { .fd = fd, .events = POLLIN };
@@ -163,3 +168,41 @@ int perf_event_poller(int fd, perf_event_print_fn output_fn)
 
 	return ret;
 }
+
+int perf_event_poller_multi(int *fds, struct perf_event_mmap_page **headers,
+			    int num_fds, perf_event_print_fn output_fn)
+{
+	enum bpf_perf_event_ret ret;
+	struct pollfd *pfds;
+	void *buf = NULL;
+	size_t len = 0;
+	int i;
+
+	pfds = malloc(sizeof(*pfds) * num_fds);
+	if (!pfds)
+		return -1;
+
+	memset(pfds, 0, sizeof(*pfds) * num_fds);
+	for (i = 0; i < num_fds; i++) {
+		pfds[i].fd = fds[i];
+		pfds[i].events = POLLIN;
+	}
+
+	for (;;) {
+		poll(pfds, num_fds, 1000);
+		for (i = 0; i < num_fds; i++) {
+			if (pfds[i].revents) {
+				ret = bpf_perf_event_read_simple(headers[i], page_cnt * page_size,
+								page_size, &buf, &len,
+								bpf_perf_event_print,
+								output_fn);
+				if (ret != LIBBPF_PERF_EVENT_CONT)
+					break;
+			}
+		}
+	}
+	free(buf);
+	free(pfds);
+
+	return ret;
+}
diff --git a/tools/testing/selftests/bpf/trace_helpers.h b/tools/testing/selftests/bpf/trace_helpers.h
index 3b4bcf7f5084..18924f23db1b 100644
--- a/tools/testing/selftests/bpf/trace_helpers.h
+++ b/tools/testing/selftests/bpf/trace_helpers.h
@@ -3,6 +3,7 @@
 #define __TRACE_HELPER_H
 
 #include <libbpf.h>
+#include <linux/perf_event.h>
 
 struct ksym {
 	long addr;
@@ -16,6 +17,9 @@ long ksym_get_addr(const char *name);
 typedef enum bpf_perf_event_ret (*perf_event_print_fn)(void *data, int size);
 
 int perf_event_mmap(int fd);
+int perf_event_mmap_header(int fd, struct perf_event_mmap_page **header);
 /* return LIBBPF_PERF_EVENT_DONE or LIBBPF_PERF_EVENT_ERROR */
 int perf_event_poller(int fd, perf_event_print_fn output_fn);
+int perf_event_poller_multi(int *fds, struct perf_event_mmap_page **headers,
+			    int num_fds, perf_event_print_fn output_fn);
 #endif
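
A standalone sketch of a multi-fd poll loop, for comparison with
perf_event_poller_multi() above. In the posted version the `break` appears to
leave only the inner for loop, so the outer for (;;) would not terminate and
the free() calls would not be reached; the sketch below makes the exit
condition part of both loops. It uses plain poll(2) and a generic per-fd
handler instead of the libbpf perf ring reader, and all sketch_* names are
hypothetical.

```c
#include <assert.h>
#include <poll.h>
#include <stdlib.h>
#include <unistd.h>

enum sketch_ret { SKETCH_CONT, SKETCH_DONE, SKETCH_ERROR };

typedef enum sketch_ret (*sketch_handler_fn)(int fd);

/* Example handler: consume one byte, then ask the poller to stop. */
static enum sketch_ret sketch_read_one(int fd)
{
	char c;

	return read(fd, &c, 1) == 1 ? SKETCH_DONE : SKETCH_ERROR;
}

static enum sketch_ret sketch_poller_multi(const int *fds, int num_fds,
					   sketch_handler_fn handler)
{
	enum sketch_ret ret = SKETCH_CONT;
	struct pollfd *pfds;
	int i;

	assert(num_fds > 0);

	pfds = calloc((size_t)num_fds, sizeof(*pfds));
	if (!pfds)
		return SKETCH_ERROR;

	for (i = 0; i < num_fds; i++) {
		pfds[i].fd = fds[i];
		pfds[i].events = POLLIN;
	}

	/* Both loops test ret, so a handler result other than SKETCH_CONT
	 * ends polling and falls through to the cleanup below. */
	while (ret == SKETCH_CONT) {
		if (poll(pfds, (nfds_t)num_fds, 1000) < 0) {
			ret = SKETCH_ERROR;	/* EINTR is treated as fatal here */
			break;
		}
		for (i = 0; i < num_fds && ret == SKETCH_CONT; i++) {
			if (pfds[i].revents)
				ret = handler(pfds[i].fd);
		}
	}

	free(pfds);
	return ret;
}
```

Initialising ret before the loop also means the function never returns an
indeterminate value.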


* [PATCH bpf-next v3 2/2] samples/bpf: Add xdp_sample_pkts example
From: Toke Høiland-Jørgensen @ 2018-06-05 11:14 UTC (permalink / raw)
  To: netdev
In-Reply-To: <152819729342.9696.4421334230852378808.stgit@alrua-kau>

This adds an example program showing how to sample packets from XDP using
the perf event buffer. The example userspace program just prints the
ethernet header for every packet sampled.

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
---
 samples/bpf/Makefile               |    4 +
 samples/bpf/xdp_sample_pkts_kern.c |   62 +++++++++++++
 samples/bpf/xdp_sample_pkts_user.c |  176 ++++++++++++++++++++++++++++++++++++
 3 files changed, 242 insertions(+)
 create mode 100644 samples/bpf/xdp_sample_pkts_kern.c
 create mode 100644 samples/bpf/xdp_sample_pkts_user.c

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 1303af10e54d..9ea2f7b64869 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -52,6 +52,7 @@ hostprogs-y += xdp_adjust_tail
 hostprogs-y += xdpsock
 hostprogs-y += xdp_fwd
 hostprogs-y += task_fd_query
+hostprogs-y += xdp_sample_pkts
 
 # Libbpf dependencies
 LIBBPF = $(TOOLS_PATH)/lib/bpf/libbpf.a
@@ -107,6 +108,7 @@ xdp_adjust_tail-objs := xdp_adjust_tail_user.o
 xdpsock-objs := bpf_load.o xdpsock_user.o
 xdp_fwd-objs := bpf_load.o xdp_fwd_user.o
 task_fd_query-objs := bpf_load.o task_fd_query_user.o $(TRACE_HELPERS)
+xdp_sample_pkts-objs := xdp_sample_pkts_user.o $(TRACE_HELPERS)
 
 # Tell kbuild to always build the programs
 always := $(hostprogs-y)
@@ -163,6 +165,7 @@ always += xdp_adjust_tail_kern.o
 always += xdpsock_kern.o
 always += xdp_fwd_kern.o
 always += task_fd_query_kern.o
+always += xdp_sample_pkts_kern.o
 
 HOSTCFLAGS += -I$(objtree)/usr/include
 HOSTCFLAGS += -I$(srctree)/tools/lib/
@@ -179,6 +182,7 @@ HOSTCFLAGS_spintest_user.o += -I$(srctree)/tools/lib/bpf/
 HOSTCFLAGS_trace_event_user.o += -I$(srctree)/tools/lib/bpf/
 HOSTCFLAGS_sampleip_user.o += -I$(srctree)/tools/lib/bpf/
 HOSTCFLAGS_task_fd_query_user.o += -I$(srctree)/tools/lib/bpf/
+HOSTCFLAGS_xdp_sample_pkts_user.o += -I$(srctree)/tools/lib/bpf/
 
 HOST_LOADLIBES		+= $(LIBBPF) -lelf
 HOSTLOADLIBES_tracex4		+= -lrt
diff --git a/samples/bpf/xdp_sample_pkts_kern.c b/samples/bpf/xdp_sample_pkts_kern.c
new file mode 100644
index 000000000000..4560522ca015
--- /dev/null
+++ b/samples/bpf/xdp_sample_pkts_kern.c
@@ -0,0 +1,62 @@
+#include <linux/ptrace.h>
+#include <linux/version.h>
+#include <uapi/linux/bpf.h>
+#include "bpf_helpers.h"
+
+#define SAMPLE_SIZE 64ul
+#define MAX_CPUS 24
+
+#define bpf_printk(fmt, ...)					\
+({								\
+	       char ____fmt[] = fmt;				\
+	       bpf_trace_printk(____fmt, sizeof(____fmt),	\
+				##__VA_ARGS__);			\
+})
+
+struct bpf_map_def SEC("maps") my_map = {
+	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
+	.key_size = sizeof(int),
+	.value_size = sizeof(u32),
+	.max_entries = MAX_CPUS,
+};
+
+SEC("xdp_sample")
+int xdp_sample_prog(struct xdp_md *ctx)
+{
+	void *data_end = (void *)(long)ctx->data_end;
+	void *data = (void *)(long)ctx->data;
+
+        /* Metadata will be in the perf event before the packet data. */
+	struct S {
+		u16 cookie;
+		u16 pkt_len;
+	} __attribute__((packed)) metadata;
+
+	if (data + SAMPLE_SIZE < data_end) {
+		/* The XDP perf_event_output handler will use the upper 32 bits
+		 * of the flags argument as a number of bytes to include of the
+		 * packet payload in the event data. If the size is too big, the
+		 * call to bpf_perf_event_output will fail and return -EFAULT.
+		 *
+		 * See bpf_xdp_event_output in net/core/filter.c.
+		 *
+		 * The BPF_F_CURRENT_CPU flag means that the event output fd
+		 * will be indexed by the CPU number in the event map.
+		 */
+		u64 flags = (SAMPLE_SIZE << 32) | BPF_F_CURRENT_CPU;
+		int ret;
+
+		metadata.cookie = 0xdead;
+		metadata.pkt_len = (u16)(data_end - data);
+
+		ret = bpf_perf_event_output(ctx, &my_map, flags,
+				      &metadata, sizeof(metadata));
+		if(ret)
+			bpf_printk("perf_event_output failed: %d\n", ret);
+	}
+
+	return XDP_PASS;
+}
+
+char _license[] SEC("license") = "GPL";
+u32 _version SEC("version") = LINUX_VERSION_CODE;
diff --git a/samples/bpf/xdp_sample_pkts_user.c b/samples/bpf/xdp_sample_pkts_user.c
new file mode 100644
index 000000000000..672392d48ce3
--- /dev/null
+++ b/samples/bpf/xdp_sample_pkts_user.c
@@ -0,0 +1,176 @@
+/* This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+#include <stdio.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <string.h>
+#include <fcntl.h>
+#include <poll.h>
+#include <linux/perf_event.h>
+#include <linux/bpf.h>
+#include <net/if.h>
+#include <errno.h>
+#include <assert.h>
+#include <sys/sysinfo.h>
+#include <sys/syscall.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <time.h>
+#include <signal.h>
+#include <libbpf.h>
+#include <bpf/bpf.h>
+
+#include "perf-sys.h"
+#include "trace_helpers.h"
+
+#define MAX_CPUS 24
+static int pmu_fds[MAX_CPUS], if_idx = 0;
+static struct perf_event_mmap_page *headers[MAX_CPUS];
+static char *if_name;
+
+static int do_attach(int idx, int fd, const char *name)
+{
+	int err;
+
+	err = bpf_set_link_xdp_fd(idx, fd, 0);
+	if (err < 0)
+		printf("ERROR: failed to attach program to %s\n", name);
+
+	return err;
+}
+
+static int do_detach(int idx, const char *name)
+{
+	int err;
+
+	err = bpf_set_link_xdp_fd(idx, -1, 0);
+	if (err < 0)
+		printf("ERROR: failed to detach program from %s\n", name);
+
+	return err;
+}
+
+#define SAMPLE_SIZE 64
+
+static int print_bpf_output(void *data, int size)
+{
+	struct {
+		__u16 cookie;
+		__u16 pkt_len;
+		__u8  pkt_data[SAMPLE_SIZE];
+	} __attribute__((packed)) *e = data;
+	int i;
+
+	if (e->cookie != 0xdead) {
+		printf("BUG cookie %x sized %d\n",
+		       e->cookie, size);
+		return LIBBPF_PERF_EVENT_ERROR;
+	}
+
+	printf("Pkt len: %-5d bytes. Ethernet hdr: ", e->pkt_len);
+	for (i = 0; i < 14 && i < e->pkt_len; i++)
+		printf("%02x ", e->pkt_data[i]);
+	printf("\n");
+
+	return LIBBPF_PERF_EVENT_CONT;
+}
+
+static void test_bpf_perf_event(int map_fd, int num)
+{
+	struct perf_event_attr attr = {
+		.sample_type = PERF_SAMPLE_RAW,
+		.type = PERF_TYPE_SOFTWARE,
+		.config = PERF_COUNT_SW_BPF_OUTPUT,
+		.wakeup_events = 1, /* get an fd notification for every event */
+	};
+	int i;
+
+	for (i = 0; i < num; i++) {
+		int key = i;
+
+		pmu_fds[i] = sys_perf_event_open(&attr, -1/*pid*/, i/*cpu*/, -1/*group_fd*/, 0);
+
+		assert(pmu_fds[i] >= 0);
+		assert(bpf_map_update_elem(map_fd, &key, &pmu_fds[i], BPF_ANY) == 0);
+		ioctl(pmu_fds[i], PERF_EVENT_IOC_ENABLE, 0);
+	}
+}
+
+static void sig_handler(int signo)
+{
+	do_detach(if_idx, if_name);
+	exit(0);
+}
+
+int main(int argc, char **argv)
+{
+	struct bpf_prog_load_attr prog_load_attr = {
+		.prog_type	= BPF_PROG_TYPE_XDP,
+	};
+	struct bpf_object *obj;
+	struct bpf_map *map;
+	int prog_fd, map_fd;
+	char filename[256];
+	int ret, err, i;
+	int numcpus;
+
+	if (argc < 2) {
+		printf("Usage: %s <ifname>\n", argv[0]);
+		return 1;
+	}
+
+	numcpus = get_nprocs();
+	if (numcpus > MAX_CPUS)
+		numcpus = MAX_CPUS;
+
+	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
+	prog_load_attr.file = filename;
+
+	if (bpf_prog_load_xattr(&prog_load_attr, &obj, &prog_fd))
+		return 1;
+
+	if (!prog_fd) {
+		printf("load_bpf_file: %s\n", strerror(errno));
+		return 1;
+	}
+
+	map = bpf_map__next(NULL, obj);
+	if (!map) {
+		printf("finding a map in obj file failed\n");
+		return 1;
+	}
+	map_fd = bpf_map__fd(map);
+
+	if_idx = if_nametoindex(argv[1]);
+	if (!if_idx)
+		if_idx = strtoul(argv[1], NULL, 0);
+
+	if (!if_idx) {
+		fprintf(stderr, "Invalid ifname\n");
+		return 1;
+	}
+	if_name = argv[1];
+	err = do_attach(if_idx, prog_fd, argv[1]);
+	if (err)
+		return err;
+
+	if (signal(SIGINT, sig_handler) ||
+	    signal(SIGHUP, sig_handler) ||
+	    signal(SIGTERM, sig_handler)) {
+		perror("signal");
+		return 1;
+	}
+
+	test_bpf_perf_event(map_fd, numcpus);
+
+	for (i = 0; i < numcpus; i++)
+		if (perf_event_mmap_header(pmu_fds[i], &headers[i]) < 0)
+			return 1;
+
+	ret = perf_event_poller_multi(pmu_fds, headers, numcpus, print_bpf_output);
+	kill(0, SIGINT);
+	return ret;
+}
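
A userspace sketch of the two data layouts this sample relies on: the flags
word passed to bpf_perf_event_output() and the event record seen by
print_bpf_output(). SKETCH_F_CURRENT_CPU is assumed to match the uapi value
of BPF_F_CURRENT_CPU (the low 32 bits all set); the sketch_* helpers are
hypothetical and only mirror what the code comments in the patch describe.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_SAMPLE_SIZE 64ULL
/* Assumption: same value as BPF_F_CURRENT_CPU in the uapi header. */
#define SKETCH_F_CURRENT_CPU 0xffffffffULL

/* Upper 32 bits: number of packet bytes to append; lower 32 bits: map index,
 * here the "current CPU" marker. */
static uint64_t sketch_make_flags(uint64_t sample_size)
{
	return (sample_size << 32) | SKETCH_F_CURRENT_CPU;
}

static uint32_t sketch_flags_ctx_len(uint64_t flags)
{
	return (uint32_t)(flags >> 32);
}

static uint32_t sketch_flags_index(uint64_t flags)
{
	return (uint32_t)(flags & 0xffffffffULL);
}

/* Record layout: the metadata struct first, then the sampled packet bytes. */
struct sketch_event {
	uint16_t cookie;
	uint16_t pkt_len;
	uint8_t  pkt_data[SKETCH_SAMPLE_SIZE];
} __attribute__((packed));

/* Copies the 14-byte Ethernet header out of a record.
 * Returns 0 on success, -1 if the record is short or the cookie is wrong. */
static int sketch_parse_event(const void *data, size_t size, uint8_t hdr[14])
{
	struct sketch_event e;

	if (size < sizeof(e))
		return -1;
	memcpy(&e, data, sizeof(e));
	if (e.cookie != 0xdead)
		return -1;
	memcpy(hdr, e.pkt_data, 14);
	return 0;
}
```

Unlike print_bpf_output(), the sketch checks the record size before reading
the fields; whether that check is needed in the sample depends on guarantees
from the perf reader that are not shown here.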


* [PATCH] net: hns3: remove unused hclgevf_cfg_func_mta_filter
From: Arnd Bergmann @ 2018-06-05 11:38 UTC (permalink / raw)
  To: Yisen Zhuang, Salil Mehta
  Cc: Arnd Bergmann, David S. Miller, Peng Li, Fuyun Liang,
	Yunsheng Lin, Jian Shen, Xi Wang, netdev, linux-kernel

The last patch apparently added a complete replacement for this
function, but left the old one in place, which now causes a
harmless warning:

drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c:731:12: 'hclgevf_cfg_func_mta_filter' defined but not used

I assume it can be removed.

Fixes: 3a678b5806e6 ("net: hns3: Optimize the VF's process of updating multicast MAC")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
index dd8e8e6718dc..bc8a5760d959 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -728,17 +728,6 @@ static void hclgevf_reset_tqp_stats(struct hnae3_handle *handle)
 	}
 }
 
-static int hclgevf_cfg_func_mta_filter(struct hnae3_handle *handle, bool en)
-{
-	struct hclgevf_dev *hdev = hclgevf_ae_get_hdev(handle);
-	u8 msg[2] = {0};
-
-	msg[0] = en;
-	return hclgevf_send_mbx_msg(hdev, HCLGE_MBX_SET_MULTICAST,
-				    HCLGE_MBX_MAC_VLAN_MC_FUNC_MTA_ENABLE,
-				    msg, 1, false, NULL, 0);
-}
-
 static int hclgevf_cfg_func_mta_type(struct hclgevf_dev *hdev)
 {
 	u8 resp_msg = HCLGEVF_MTA_TYPE_SEL_MAX;
-- 
2.9.0


* [PATCH] netfilter: provide udp*_lib_lookup for nf_tproxy
From: Arnd Bergmann @ 2018-06-05 11:40 UTC (permalink / raw)
  To: Pablo Neira Ayuso, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI
  Cc: Máté Eckl, Arnd Bergmann, Paolo Abeni, Willem de Bruijn,
	Eric Dumazet, David Ahern, Martin KaFai Lau, netdev, linux-kernel

It is now possible to enable the libified nf_tproxy modules without
also enabling NETFILTER_XT_TARGET_TPROXY, which throws off the
ifdef logic in the udp core code:

net/ipv6/netfilter/nf_tproxy_ipv6.o: In function `nf_tproxy_get_sock_v6':
nf_tproxy_ipv6.c:(.text+0x1a8): undefined reference to `udp6_lib_lookup'
net/ipv4/netfilter/nf_tproxy_ipv4.o: In function `nf_tproxy_get_sock_v4':
nf_tproxy_ipv4.c:(.text+0x3d0): undefined reference to `udp4_lib_lookup'

We can actually simplify the conditions now to provide the two functions
exactly when they are needed.

Fixes: 45ca4e0cf273 ("netfilter: Libify xt_TPROXY")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 net/ipv4/udp.c | 4 +---
 net/ipv6/udp.c | 4 +---
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 4f16e5d71875..3365362cac88 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -544,9 +544,7 @@ EXPORT_SYMBOL_GPL(udp4_lib_lookup_skb);
 /* Must be called under rcu_read_lock().
  * Does increment socket refcount.
  */
-#if IS_ENABLED(CONFIG_NETFILTER_XT_MATCH_SOCKET) || \
-    IS_ENABLED(CONFIG_NETFILTER_XT_TARGET_TPROXY) || \
-    IS_ENABLED(CONFIG_NF_SOCKET_IPV4)
+#if IS_ENABLED(CONFIG_NF_TPROXY_IPV4) || IS_ENABLED(CONFIG_NF_SOCKET_IPV4)
 struct sock *udp4_lib_lookup(struct net *net, __be32 saddr, __be16 sport,
 			     __be32 daddr, __be16 dport, int dif)
 {
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 967acff95bbe..164afd31aebf 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -285,9 +285,7 @@ EXPORT_SYMBOL_GPL(udp6_lib_lookup_skb);
 /* Must be called under rcu_read_lock().
  * Does increment socket refcount.
  */
-#if IS_ENABLED(CONFIG_NETFILTER_XT_MATCH_SOCKET) || \
-    IS_ENABLED(CONFIG_NETFILTER_XT_TARGET_TPROXY) || \
-    IS_ENABLED(CONFIG_NF_SOCKET_IPV6)
+#if IS_ENABLED(CONFIG_NF_TPROXY_IPV6) || IS_ENABLED(CONFIG_NF_SOCKET_IPV6)
 struct sock *udp6_lib_lookup(struct net *net, const struct in6_addr *saddr, __be16 sport,
 			     const struct in6_addr *daddr, __be16 dport, int dif)
 {
-- 
2.9.0
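
A standalone sketch of how an IS_ENABLED()-style test picks up both built-in
and modular options, which is what the new #if conditions rely on. The SK_*
macros are a simplified re-implementation written for this example, modelled
on the kernel's approach but not copied from it; the configuration chosen
below (tproxy as a module, socket matching off) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical build configuration: NF_TPROXY_IPV4=m, NF_SOCKET_IPV4 unset. */
#define CONFIG_NF_TPROXY_IPV4_MODULE 1

/* If 'val' expands to 1, SK_PLACEHOLDER_1 inserts an extra "0," argument,
 * which shifts the literal 1 into the second position. Otherwise the pasted
 * name is junk and the second argument is 0. */
#define SK_PLACEHOLDER_1 0,
#define SK_SECOND(ignored, val, ...) val
#define SK_IS_DEFINED(x) SK_IS_DEFINED_(x)
#define SK_IS_DEFINED_(val) SK_IS_DEFINED__(SK_PLACEHOLDER_##val)
#define SK_IS_DEFINED__(arg_or_junk) SK_SECOND(arg_or_junk 1, 0, 0)

/* Enabled means built in (CONFIG_FOO) or modular (CONFIG_FOO_MODULE). */
#define SK_IS_ENABLED(opt) (SK_IS_DEFINED(opt) || SK_IS_DEFINED(opt##_MODULE))

#if SK_IS_ENABLED(CONFIG_NF_TPROXY_IPV4) || SK_IS_ENABLED(CONFIG_NF_SOCKET_IPV4)
/* Compiled whenever either user of the lookup is enabled. */
static bool sketch_udp4_lookup_available(void) { return true; }
#else
static bool sketch_udp4_lookup_available(void) { return false; }
#endif
```

Keying the condition on the library options themselves means the lookup is
provided whenever one of its users is built, whichever front end selected it.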


* [PATCH net-next 0/3] Bug fixes & optimization for HNS3 Driver
From: Salil Mehta @ 2018-06-05 11:41 UTC (permalink / raw)
  To: davem
  Cc: salil.mehta, yisen.zhuang, lipeng321, mehta.salil, netdev,
	linux-kernel, linuxarm

This patch-set presents 2 priority bug fixes and an optimization for
HNS3 driver.

Xi Wang (3):
  net: hns3: Fix for VF mailbox cannot receiving PF response
  net: hns3: Fix for VF mailbox receiving unknown message
  net: hns3: Optimize PF CMDQ interrupt switching process

 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c    | 13 ++++++++++++
 .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c  |  3 +++
 .../ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c   | 23 +++++++++++++++++++---
 3 files changed, 36 insertions(+), 3 deletions(-)

-- 
2.7.4


* [PATCH net-next 1/3] net: hns3: Fix for VF mailbox cannot receiving PF response
From: Salil Mehta @ 2018-06-05 11:41 UTC (permalink / raw)
  To: davem
  Cc: salil.mehta, yisen.zhuang, lipeng321, mehta.salil, netdev,
	linux-kernel, linuxarm, Xi Wang
In-Reply-To: <20180605114201.29900-1-salil.mehta@huawei.com>

From: Xi Wang <wangxi11@huawei.com>

When the VF frequently switches the CMDQ interrupt and CMDQ_SRC is not
cleared, the VF will not receive the new PF response after the interrupt
is re-enabled. The corresponding log is as follows:

[  317.482222] hns3 0000:00:03.0: VF could not get mbx resp(=0) from PF
in 500 tries
[  317.483137] hns3 0000:00:03.0: VF request to get tqp info from PF
failed -5

This patch fixes this problem by clearing CMDQ_SRC before enabling
interrupt and syncing pending IRQ handlers after disabling interrupt.

Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
index dd8e8e6..d55ee9c 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -1576,6 +1576,8 @@ static int hclgevf_misc_irq_init(struct hclgevf_dev *hdev)
 		return ret;
 	}
 
+	hclgevf_clear_event_cause(hdev, 0);
+
 	/* enable misc. vector(vector 0) */
 	hclgevf_enable_vector(&hdev->misc_vector, true);
 
@@ -1586,6 +1588,7 @@ static void hclgevf_misc_irq_uninit(struct hclgevf_dev *hdev)
 {
 	/* disable misc vector(vector 0) */
 	hclgevf_enable_vector(&hdev->misc_vector, false);
+	synchronize_irq(hdev->misc_vector.vector_irq);
 	free_irq(hdev->misc_vector.vector_irq, hdev);
 	hclgevf_free_vector(hdev, 0);
 }
-- 
2.7.4


* [PATCH net-next 2/3] net: hns3: Fix for VF mailbox receiving unknown message
From: Salil Mehta @ 2018-06-05 11:42 UTC (permalink / raw)
  To: davem
  Cc: salil.mehta, yisen.zhuang, lipeng321, mehta.salil, netdev,
	linux-kernel, linuxarm, Xi Wang
In-Reply-To: <20180605114201.29900-1-salil.mehta@huawei.com>

From: Xi Wang <wangxi11@huawei.com>

If the VF driver reads the data in the crq before the firmware has
updated the crq's tail pointer, the data may still be incomplete,
which leads the driver to read an unknown message.

This patch fixes it by checking if crq is empty before reading the
message.

Fixes: b11a0bb231f3 ("net: hns3: Add mailbox support to VF driver")
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 .../ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c   | 23 +++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c
index a286184..173ca27 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c
@@ -126,6 +126,13 @@ int hclgevf_send_mbx_msg(struct hclgevf_dev *hdev, u16 code, u16 subcode,
 	return status;
 }
 
+static bool hclgevf_cmd_crq_empty(struct hclgevf_hw *hw)
+{
+	u32 tail = hclgevf_read_dev(hw, HCLGEVF_NIC_CRQ_TAIL_REG);
+
+	return tail == hw->cmq.crq.next_to_use;
+}
+
 void hclgevf_mbx_handler(struct hclgevf_dev *hdev)
 {
 	struct hclgevf_mbx_resp_status *resp;
@@ -140,11 +147,22 @@ void hclgevf_mbx_handler(struct hclgevf_dev *hdev)
 	resp = &hdev->mbx_resp;
 	crq = &hdev->hw.cmq.crq;
 
-	flag = le16_to_cpu(crq->desc[crq->next_to_use].flag);
-	while (hnae_get_bit(flag, HCLGEVF_CMDQ_RX_OUTVLD_B)) {
+	while (!hclgevf_cmd_crq_empty(&hdev->hw)) {
 		desc = &crq->desc[crq->next_to_use];
 		req = (struct hclge_mbx_pf_to_vf_cmd *)desc->data;
 
+		flag = le16_to_cpu(crq->desc[crq->next_to_use].flag);
+		if (unlikely(!hnae3_get_bit(flag, HCLGEVF_CMDQ_RX_OUTVLD_B))) {
+			dev_warn(&hdev->pdev->dev,
+				 "dropped invalid mailbox message, code = %d\n",
+				 req->msg[0]);
+
+			/* dropping/not processing this invalid message */
+			crq->desc[crq->next_to_use].flag = 0;
+			hclge_mbx_ring_ptr_move_crq(crq);
+			continue;
+		}
+
 		/* synchronous messages are time critical and need preferential
 		 * treatment. Therefore, we need to acknowledge all the sync
 		 * responses as quickly as possible so that waiting tasks do not
@@ -205,7 +223,6 @@ void hclgevf_mbx_handler(struct hclgevf_dev *hdev)
 		}
 		crq->desc[crq->next_to_use].flag = 0;
 		hclge_mbx_ring_ptr_move_crq(crq);
-		flag = le16_to_cpu(crq->desc[crq->next_to_use].flag);
 	}
 
 	/* Write back CMDQ_RQ header pointer, M7 need this pointer */
-- 
2.7.4
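
The empty check described in the commit message above can be illustrated
with a small user-space sketch. This is not the hns3 driver code: the
toy_* names, RING_SIZE and FLAG_VALID are hypothetical stand-ins, and the
"firmware" is just a function that advances the tail. It only shows the
idea of draining up to the producer's tail and skipping entries whose
valid flag is not set.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE  8u     /* hypothetical ring depth */
#define FLAG_VALID 0x1u   /* stands in for the OUTVLD bit */

struct toy_desc {
	uint16_t flag;
	uint16_t code;
};

struct toy_ring {
	struct toy_desc desc[RING_SIZE];
	uint32_t next_to_use;   /* consumer index, owned by the driver */
	uint32_t tail;          /* producer index, written by the "firmware" */
};

/* The ring is empty once the consumer has caught up with the producer. */
static bool toy_ring_empty(const struct toy_ring *r)
{
	return r->tail == r->next_to_use;
}

/* Producer side: fill one descriptor, then advance the tail. */
static void toy_ring_post(struct toy_ring *r, uint16_t code, uint16_t flag)
{
	r->desc[r->tail].code = code;
	r->desc[r->tail].flag = flag;
	r->tail = (r->tail + 1) % RING_SIZE;
}

/*
 * Consumer side: drain everything up to the tail. Entries without the
 * valid flag are counted as dropped and skipped instead of processed.
 * Returns the number of entries handled; *dropped gets the number skipped.
 */
static int toy_ring_drain(struct toy_ring *r, int *dropped)
{
	int handled = 0;

	*dropped = 0;
	while (!toy_ring_empty(r)) {
		struct toy_desc *d = &r->desc[r->next_to_use];

		if (d->flag & FLAG_VALID)
			handled++;
		else
			(*dropped)++;

		/* Clear the flag and move on, valid or not. */
		d->flag = 0;
		r->next_to_use = (r->next_to_use + 1) % RING_SIZE;
	}
	return handled;
}
```

Because the loop condition depends only on the two indices, the consumer
never looks at a descriptor the producer has not yet published, which is
the property the patch relies on.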

* [PATCH net-next 3/3] net: hns3: Optimize PF CMDQ interrupt switching process
From: Salil Mehta @ 2018-06-05 11:42 UTC (permalink / raw)
  To: davem
  Cc: salil.mehta, yisen.zhuang, lipeng321, mehta.salil, netdev,
	linux-kernel, linuxarm, Xi Wang
In-Reply-To: <20180605114201.29900-1-salil.mehta@huawei.com>

From: Xi Wang <wangxi11@huawei.com>

When the PF frequently switches the CMDQ interrupt, if the CMDQ_SRC is
not cleared before the hardware interrupt is generated, the new interrupt
will not be reported.

This patch fixes the problem by clearing CMDQ_SRC and RESET_STS before
enabling the interrupt, and by synchronizing with pending IRQ handlers
after disabling the interrupt.

Fixes: 466b0c00391b ("net: hns3: Add support for misc interrupt")
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 2a80134..d318d35 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -2557,6 +2557,15 @@ static void hclge_clear_event_cause(struct hclge_dev *hdev, u32 event_type,
 	}
 }
 
+static void hclge_clear_all_event_cause(struct hclge_dev *hdev)
+{
+	hclge_clear_event_cause(hdev, HCLGE_VECTOR0_EVENT_RST,
+				BIT(HCLGE_VECTOR0_GLOBALRESET_INT_B) |
+				BIT(HCLGE_VECTOR0_CORERESET_INT_B) |
+				BIT(HCLGE_VECTOR0_IMPRESET_INT_B));
+	hclge_clear_event_cause(hdev, HCLGE_VECTOR0_EVENT_MBX, 0);
+}
+
 static void hclge_enable_vector(struct hclge_misc_vector *vector, bool enable)
 {
 	writel(enable ? 1 : 0, vector->addr);
@@ -5688,6 +5697,8 @@ static int hclge_init_ae_dev(struct hnae3_ae_dev *ae_dev)
 	INIT_WORK(&hdev->rst_service_task, hclge_reset_service_task);
 	INIT_WORK(&hdev->mbx_service_task, hclge_mailbox_service_task);
 
+	hclge_clear_all_event_cause(hdev);
+
 	/* Enable MISC vector(vector0) */
 	hclge_enable_vector(&hdev->misc_vector, true);
 
@@ -5817,6 +5828,8 @@ static void hclge_uninit_ae_dev(struct hnae3_ae_dev *ae_dev)
 
 	/* Disable MISC vector(vector0) */
 	hclge_enable_vector(&hdev->misc_vector, false);
+	synchronize_irq(hdev->misc_vector.vector_irq);
+
 	hclge_destroy_cmd_queue(&hdev->hw);
 	hclge_misc_irq_uninit(hdev);
 	hclge_pci_uninit(hdev);
-- 
2.7.4
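
To make the "clear before enable" ordering above concrete, here is a toy
user-space model. It assumes, as the commit message suggests, that an
event is reported only when its source bit was not already latched. The
toy_* names and the struct are hypothetical and do not correspond to the
real hclge registers or helpers.

```c
#include <stdbool.h>
#include <stdint.h>

struct toy_vector {
	uint32_t src;     /* latched event-source bits (stands in for CMDQ_SRC) */
	bool enabled;     /* whether the vector is currently enabled */
	int delivered;    /* interrupts that reached the handler */
};

/*
 * Hardware raises an event. In this model only a 0 -> 1 transition of
 * the source bit is reported; a bit that is already latched hides the
 * new event.
 */
static void toy_hw_event(struct toy_vector *v, uint32_t bit)
{
	bool already_latched = (v->src & bit) != 0;

	v->src |= bit;
	if (v->enabled && !already_latched)
		v->delivered++;
}

/* Clear stale source bits, as the patch does before enabling the vector. */
static void toy_clear_event_cause(struct toy_vector *v, uint32_t bits)
{
	v->src &= ~bits;
}

static void toy_enable_vector(struct toy_vector *v, bool enable)
{
	v->enabled = enable;
}
```

In this model, enabling the vector while a stale bit is still latched
loses the next event, whereas calling toy_clear_event_cause() first lets
it through.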

* [bpf-next PATCH 0/5] net/xdp: remove net_device operation ndo_xdp_flush
From: Jesper Dangaard Brouer @ 2018-06-05 11:55 UTC (permalink / raw)
  To: netdev, Daniel Borkmann, Alexei Starovoitov,
	Jesper Dangaard Brouer
  Cc: liu.song.a23, songliubraving, John Fastabend

This patchset removes the net_device operation ndo_xdp_flush() call.
It is a follow-up to merge commit ea9916ea3ed9 ("Merge branch
'ndo_xdp_xmit-cleanup'").  After commit c1ece6b245bd ("bpf/xdp:
devmap can avoid calling ndo_xdp_flush"), no callers of ndo_xdp_flush
are left in the bpf-next tree.

---

Jesper Dangaard Brouer (5):
      i40e: remove ndo_xdp_flush call i40e_xdp_flush
      ixgbe: remove ndo_xdp_flush call ixgbe_xdp_flush
      virtio_net: remove ndo_xdp_flush call virtnet_xdp_flush
      tun: remove ndo_xdp_flush call tun_xdp_flush
      net: remove net_device operation ndo_xdp_flush


 drivers/net/ethernet/intel/i40e/i40e_main.c   |    1 -
 drivers/net/ethernet/intel/i40e/i40e_txrx.c   |   19 -------------------
 drivers/net/ethernet/intel/i40e/i40e_txrx.h   |    1 -
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   21 ---------------------
 drivers/net/tun.c                             |   23 +----------------------
 drivers/net/virtio_net.c                      |   13 -------------
 include/linux/netdevice.h                     |    4 ----
 7 files changed, 1 insertion(+), 81 deletions(-)

* [bpf-next PATCH 1/5] i40e: remove ndo_xdp_flush call i40e_xdp_flush
From: Jesper Dangaard Brouer @ 2018-06-05 11:55 UTC (permalink / raw)
  To: netdev, Daniel Borkmann, Alexei Starovoitov,
	Jesper Dangaard Brouer
  Cc: liu.song.a23, songliubraving, John Fastabend
In-Reply-To: <152819969561.7083.15427306662397720502.stgit@firesoul>

Remove the ndo_xdp_flush call implementation i40e_xdp_flush
as no callers of ndo_xdp_flush are left.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c |    1 -
 drivers/net/ethernet/intel/i40e/i40e_txrx.c |   19 -------------------
 drivers/net/ethernet/intel/i40e/i40e_txrx.h |    1 -
 3 files changed, 21 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index b5daa5c9c7de..c944bd10b03d 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -11883,7 +11883,6 @@ static const struct net_device_ops i40e_netdev_ops = {
 	.ndo_bridge_setlink	= i40e_ndo_bridge_setlink,
 	.ndo_bpf		= i40e_xdp,
 	.ndo_xdp_xmit		= i40e_xdp_xmit,
-	.ndo_xdp_flush		= i40e_xdp_flush,
 };
 
 /**
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 5f01e4ce9c92..713995d04783 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -3707,22 +3707,3 @@ int i40e_xdp_xmit(struct net_device *dev, int n, struct xdp_frame **frames,
 
 	return n - drops;
 }
-
-/**
- * i40e_xdp_flush - Implements ndo_xdp_flush
- * @dev: netdev
- **/
-void i40e_xdp_flush(struct net_device *dev)
-{
-	struct i40e_netdev_priv *np = netdev_priv(dev);
-	unsigned int queue_index = smp_processor_id();
-	struct i40e_vsi *vsi = np->vsi;
-
-	if (test_bit(__I40E_VSI_DOWN, vsi->state))
-		return;
-
-	if (!i40e_enabled_xdp_vsi(vsi) || queue_index >= vsi->num_queue_pairs)
-		return;
-
-	i40e_xdp_ring_update_tail(vsi->xdp_rings[queue_index]);
-}
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.h b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
index 820f76db251b..bb04f6a731fe 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
@@ -489,7 +489,6 @@ int __i40e_maybe_stop_tx(struct i40e_ring *tx_ring, int size);
 bool __i40e_chk_linearize(struct sk_buff *skb);
 int i40e_xdp_xmit(struct net_device *dev, int n, struct xdp_frame **frames,
 		  u32 flags);
-void i40e_xdp_flush(struct net_device *dev);
 
 /**
  * i40e_get_head - Retrieve head from head writeback

* [bpf-next PATCH 2/5] ixgbe: remove ndo_xdp_flush call ixgbe_xdp_flush
From: Jesper Dangaard Brouer @ 2018-06-05 11:55 UTC (permalink / raw)
  To: netdev, Daniel Borkmann, Alexei Starovoitov,
	Jesper Dangaard Brouer
  Cc: liu.song.a23, songliubraving, John Fastabend
In-Reply-To: <152819969561.7083.15427306662397720502.stgit@firesoul>

Remove the ndo_xdp_flush call implementation ixgbe_xdp_flush
as no callers of ndo_xdp_flush are left.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   21 ---------------------
 1 file changed, 21 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 4fd77c9067f2..ef1afb3a8a97 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -10069,26 +10069,6 @@ static int ixgbe_xdp_xmit(struct net_device *dev, int n,
 	return n - drops;
 }
 
-static void ixgbe_xdp_flush(struct net_device *dev)
-{
-	struct ixgbe_adapter *adapter = netdev_priv(dev);
-	struct ixgbe_ring *ring;
-
-	/* Its possible the device went down between xdp xmit and flush so
-	 * we need to ensure device is still up.
-	 */
-	if (unlikely(test_bit(__IXGBE_DOWN, &adapter->state)))
-		return;
-
-	ring = adapter->xdp_prog ? adapter->xdp_ring[smp_processor_id()] : NULL;
-	if (unlikely(!ring))
-		return;
-
-	ixgbe_xdp_ring_update_tail(ring);
-
-	return;
-}
-
 static const struct net_device_ops ixgbe_netdev_ops = {
 	.ndo_open		= ixgbe_open,
 	.ndo_stop		= ixgbe_close,
@@ -10136,7 +10116,6 @@ static const struct net_device_ops ixgbe_netdev_ops = {
 	.ndo_features_check	= ixgbe_features_check,
 	.ndo_bpf		= ixgbe_xdp,
 	.ndo_xdp_xmit		= ixgbe_xdp_xmit,
-	.ndo_xdp_flush		= ixgbe_xdp_flush,
 };
 
 /**

* [bpf-next PATCH 3/5] virtio_net: remove ndo_xdp_flush call virtnet_xdp_flush
From: Jesper Dangaard Brouer @ 2018-06-05 11:55 UTC (permalink / raw)
  To: netdev, Daniel Borkmann, Alexei Starovoitov,
	Jesper Dangaard Brouer
  Cc: liu.song.a23, songliubraving, John Fastabend
In-Reply-To: <152819969561.7083.15427306662397720502.stgit@firesoul>

Remove the ndo_xdp_flush call implementation virtnet_xdp_flush
as no callers of ndo_xdp_flush are left.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/virtio_net.c |   13 -------------
 1 file changed, 13 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 62ba8aadd8e6..8c5b59e79439 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -407,18 +407,6 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
 	return skb;
 }
 
-static void virtnet_xdp_flush(struct net_device *dev)
-{
-	struct virtnet_info *vi = netdev_priv(dev);
-	struct send_queue *sq;
-	unsigned int qp;
-
-	qp = vi->curr_queue_pairs - vi->xdp_queue_pairs + smp_processor_id();
-	sq = &vi->sq[qp];
-
-	virtqueue_kick(sq->vq);
-}
-
 static int __virtnet_xdp_xmit_one(struct virtnet_info *vi,
 				   struct send_queue *sq,
 				   struct xdp_frame *xdpf)
@@ -2359,7 +2347,6 @@ static const struct net_device_ops virtnet_netdev = {
 #endif
 	.ndo_bpf		= virtnet_xdp,
 	.ndo_xdp_xmit		= virtnet_xdp_xmit,
-	.ndo_xdp_flush		= virtnet_xdp_flush,
 	.ndo_features_check	= passthru_features_check,
 };
 

* [bpf-next PATCH 4/5] tun: remove ndo_xdp_flush call tun_xdp_flush
From: Jesper Dangaard Brouer @ 2018-06-05 11:55 UTC (permalink / raw)
  To: netdev, Daniel Borkmann, Alexei Starovoitov,
	Jesper Dangaard Brouer
  Cc: liu.song.a23, songliubraving, John Fastabend
In-Reply-To: <152819969561.7083.15427306662397720502.stgit@firesoul>

Remove the ndo_xdp_flush call implementation tun_xdp_flush
as no callers of ndo_xdp_flush are left.

The tun driver's XDP_TX implementation also used tun_xdp_flush (and
tun_xdp_xmit).  This is easily solved by passing the XDP_XMIT_FLUSH
flag to tun_xdp_xmit in tun_xdp_tx.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/tun.c |   23 +----------------------
 1 file changed, 1 insertion(+), 22 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index d82a05fb0594..ef09224496e8 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1347,26 +1347,7 @@ static int tun_xdp_tx(struct net_device *dev, struct xdp_buff *xdp)
 	if (unlikely(!frame))
 		return -EOVERFLOW;
 
-	return tun_xdp_xmit(dev, 1, &frame, 0);
-}
-
-static void tun_xdp_flush(struct net_device *dev)
-{
-	struct tun_struct *tun = netdev_priv(dev);
-	struct tun_file *tfile;
-	u32 numqueues;
-
-	rcu_read_lock();
-
-	numqueues = READ_ONCE(tun->numqueues);
-	if (!numqueues)
-		goto out;
-
-	tfile = rcu_dereference(tun->tfiles[smp_processor_id() %
-					    numqueues]);
-	__tun_xdp_flush_tfile(tfile);
-out:
-	rcu_read_unlock();
+	return tun_xdp_xmit(dev, 1, &frame, XDP_XMIT_FLUSH);
 }
 
 static const struct net_device_ops tap_netdev_ops = {
@@ -1387,7 +1368,6 @@ static const struct net_device_ops tap_netdev_ops = {
 	.ndo_get_stats64	= tun_net_get_stats64,
 	.ndo_bpf		= tun_xdp,
 	.ndo_xdp_xmit		= tun_xdp_xmit,
-	.ndo_xdp_flush		= tun_xdp_flush,
 };
 
 static void tun_flow_init(struct tun_struct *tun)
@@ -1706,7 +1686,6 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
 			alloc_frag->offset += buflen;
 			if (tun_xdp_tx(tun->dev, &xdp))
 				goto err_redirect;
-			tun_xdp_flush(tun->dev);
 			rcu_read_unlock();
 			preempt_enable();
 			return NULL;
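
As a rough illustration of why a separate flush callback becomes
unnecessary once the xmit call takes a flags argument, here is a toy
user-space model. TOY_XMIT_FLUSH stands in for XDP_XMIT_FLUSH, and the
toy_* functions and struct are hypothetical; they are not the tun or
driver implementations.

```c
#include <stdint.h>

#define TOY_XMIT_FLUSH (1u << 0)   /* stands in for XDP_XMIT_FLUSH */

struct toy_txq {
	int queued;      /* frames placed on the ring, not yet kicked */
	int kicked;      /* frames the consumer has been notified about */
	int doorbells;   /* number of notifications issued */
};

/* Notify the consumer about everything queued so far. */
static void toy_ring_doorbell(struct toy_txq *q)
{
	q->kicked += q->queued;
	q->queued = 0;
	q->doorbells++;
}

/*
 * Queue n frames. If the caller passes TOY_XMIT_FLUSH, the notification
 * is issued as part of the same call, so no separate flush operation is
 * needed. Returns the number of frames accepted, or -1 on bad input.
 */
static int toy_xmit(struct toy_txq *q, int n, uint32_t flags)
{
	if (n < 0)
		return -1;

	q->queued += n;
	if (flags & TOY_XMIT_FLUSH)
		toy_ring_doorbell(q);
	return n;
}
```

A caller that batches frames can pass 0 for all but the last call, while
a single-frame path such as XDP_TX can pass the flush flag on its only
call.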

* [bpf-next PATCH 5/5] net: remove net_device operation ndo_xdp_flush
From: Jesper Dangaard Brouer @ 2018-06-05 11:55 UTC (permalink / raw)
  To: netdev, Daniel Borkmann, Alexei Starovoitov,
	Jesper Dangaard Brouer
  Cc: liu.song.a23, songliubraving, John Fastabend
In-Reply-To: <152819969561.7083.15427306662397720502.stgit@firesoul>

All drivers are cleaned up and no references to ndo_xdp_flush
are left in drivers, so it is time to remove the net_device_ops
operation ndo_xdp_flush.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 include/linux/netdevice.h |    4 ----
 1 file changed, 4 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 7f17785a59d7..42c6ea35a6f2 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1192,9 +1192,6 @@ struct dev_ifalias {
  *	that got dropped are freed/returned via xdp_return_frame().
  *	Returns negative number, means general error invoking ndo, meaning
  *	no frames were xmit'ed and core-caller will free all frames.
- * void (*ndo_xdp_flush)(struct net_device *dev);
- *	This function is used to inform the driver to flush a particular
- *	xdp tx queue. Must be called on same CPU as xdp_xmit.
  */
 struct net_device_ops {
 	int			(*ndo_init)(struct net_device *dev);
@@ -1382,7 +1379,6 @@ struct net_device_ops {
 	int			(*ndo_xdp_xmit)(struct net_device *dev, int n,
 						struct xdp_frame **xdp,
 						u32 flags);
-	void			(*ndo_xdp_flush)(struct net_device *dev);
 };
 
 /**

* [PATCH net-next 1/2] ipv4: replace ip_hdr() with skb->data for optimization
From: Yafang Shao @ 2018-06-05 12:04 UTC (permalink / raw)
  To: edumazet, davem; +Cc: netdev, inux-kernel, Yafang Shao

In the IP receive path, when the IP header hasn't been pulled yet,
ip_hdr() and skb->data point to the same byte.

In the IP output path, when the IP header has just been pushed, ip_hdr()
and skb->data point to the same byte.

As ip_hdr() is more expensive than using skb->data, replace ip_hdr()
with skb->data in these situations as an optimization.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 net/ipv4/ip_input.c  | 8 ++++----
 net/ipv4/ip_output.c | 8 ++++----
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index 7582713..7a03e8c 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -309,7 +309,7 @@ static inline bool ip_rcv_options(struct sk_buff *skb)
 
 static int ip_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
 {
-	const struct iphdr *iph = ip_hdr(skb);
+	const struct iphdr *iph = (const struct iphdr *)skb->data;
 	int (*edemux)(struct sk_buff *skb);
 	struct net_device *dev = skb->dev;
 	struct rtable *rt;
@@ -335,7 +335,7 @@ static int ip_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
 			if (unlikely(err))
 				goto drop_error;
 			/* must reload iph, skb->head might have changed */
-			iph = ip_hdr(skb);
+			iph = (const struct iphdr *)skb->data;
 		}
 	}
 
@@ -433,7 +433,7 @@ int ip_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt,
 	if (!pskb_may_pull(skb, sizeof(struct iphdr)))
 		goto inhdr_error;
 
-	iph = ip_hdr(skb);
+	iph = (const struct iphdr *)skb->data;
 
 	/*
 	 *	RFC1122: 3.2.1.2 MUST silently discard any IP frame that fails the checksum.
@@ -459,7 +459,7 @@ int ip_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt,
 	if (!pskb_may_pull(skb, iph->ihl*4))
 		goto inhdr_error;
 
-	iph = ip_hdr(skb);
+	iph = (const struct iphdr *)skb->data;
 
 	if (unlikely(ip_fast_csum((u8 *)iph, iph->ihl)))
 		goto csum_error;
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index af5a830..f5014cd 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -96,7 +96,7 @@ void ip_send_check(struct iphdr *iph)
 
 int __ip_local_out(struct net *net, struct sock *sk, struct sk_buff *skb)
 {
-	struct iphdr *iph = ip_hdr(skb);
+	struct iphdr *iph = (struct iphdr *)skb->data;
 
 	iph->tot_len = htons(skb->len);
 	ip_send_check(iph);
@@ -151,7 +151,7 @@ int ip_build_and_send_pkt(struct sk_buff *skb, const struct sock *sk,
 	/* Build the IP header. */
 	skb_push(skb, sizeof(struct iphdr) + (opt ? opt->opt.optlen : 0));
 	skb_reset_network_header(skb);
-	iph = ip_hdr(skb);
+	iph = (struct iphdr *)skb->data;
 	iph->version  = 4;
 	iph->ihl      = 5;
 	iph->tos      = inet->tos;
@@ -477,7 +477,7 @@ int ip_queue_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl)
 	/* OK, we know where to send it, allocate and build IP header. */
 	skb_push(skb, sizeof(struct iphdr) + (inet_opt ? inet_opt->opt.optlen : 0));
 	skb_reset_network_header(skb);
-	iph = ip_hdr(skb);
+	iph = (struct iphdr *)skb->data;
 	*((__be16 *)iph) = htons((4 << 12) | (5 << 8) | (inet->tos & 0xff));
 	if (ip_dont_fragment(sk, &rt->dst) && !skb->ignore_df)
 		iph->frag_off = htons(IP_DF);
@@ -659,7 +659,7 @@ int ip_do_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
 				__skb_push(frag, hlen);
 				skb_reset_network_header(frag);
 				memcpy(skb_network_header(frag), iph, hlen);
-				iph = ip_hdr(frag);
+				iph = (struct iphdr *)frag->data;
 				iph->tot_len = htons(frag->len);
 				ip_copy_metadata(frag, skb);
 				if (offset == 0)
-- 
1.8.3.1
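
The claim in the commit message, that the network header pointer and
skb->data coincide right after a push and reset, can be checked with a
toy user-space model. struct toy_skb and the toy_* helpers below are
hypothetical simplifications of struct sk_buff, skb_push(), skb_pull(),
skb_reset_network_header() and skb_network_header(); they only model
the pointer arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for struct sk_buff; only the fields needed here. */
struct toy_skb {
	unsigned char buf[128];
	unsigned char *head;       /* start of the buffer */
	unsigned char *data;       /* start of the current payload */
	uint16_t network_header;   /* offset of the L3 header from head */
};

static void toy_skb_init(struct toy_skb *skb, size_t headroom)
{
	skb->head = skb->buf;
	skb->data = skb->buf + headroom;
	skb->network_header = 0;
}

/* Like skb_reset_network_header(): record where data currently points. */
static void toy_reset_network_header(struct toy_skb *skb)
{
	skb->network_header = (uint16_t)(skb->data - skb->head);
}

/* Like skb_network_header(): load head, load the offset, add them. */
static unsigned char *toy_network_header(const struct toy_skb *skb)
{
	return skb->head + skb->network_header;
}

/* Like skb_push(): grow the payload towards head. */
static unsigned char *toy_push(struct toy_skb *skb, size_t len)
{
	skb->data -= len;
	return skb->data;
}

/* Like skb_pull(): consume len bytes from the front of the payload. */
static unsigned char *toy_pull(struct toy_skb *skb, size_t len)
{
	skb->data += len;
	return skb->data;
}
```

The two pointers agree only until data moves again: after a pull the
stored offset still points at the header while data has advanced, which
is why the substitution is valid only at the specific call sites named
in the commit message.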

* [PATCH net-next 2/2] ipv6: replace ip_hdr() with skb->data for optimization
From: Yafang Shao @ 2018-06-05 12:04 UTC (permalink / raw)
  To: edumazet, davem; +Cc: netdev, inux-kernel, Yafang Shao
In-Reply-To: <1528200262-11834-1-git-send-email-laoar.shao@gmail.com>

In the IPv6 receive path, when the IPv6 header hasn't been pulled yet,
ipv6_hdr() and skb->data point to the same byte.

In the IPv6 output path, when the IPv6 header has just been pushed,
ipv6_hdr() and skb->data point to the same byte.

As ipv6_hdr() is more expensive than using skb->data, replace ipv6_hdr()
with skb->data in these situations as an optimization.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 net/ipv6/ip6_input.c  | 4 ++--
 net/ipv6/ip6_output.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
index f08d344..2ff4fe8 100644
--- a/net/ipv6/ip6_input.c
+++ b/net/ipv6/ip6_input.c
@@ -113,7 +113,7 @@ int ipv6_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt
 	if (unlikely(!pskb_may_pull(skb, sizeof(*hdr))))
 		goto err;
 
-	hdr = ipv6_hdr(skb);
+	hdr = (const struct ipv6hdr *)skb->data;
 
 	if (hdr->version != 6)
 		goto err;
@@ -189,7 +189,7 @@ int ipv6_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt
 			__IP6_INC_STATS(net, idev, IPSTATS_MIB_INHDRERRORS);
 			goto drop;
 		}
-		hdr = ipv6_hdr(skb);
+		hdr = (const struct ipv6hdr *)skb->data;
 	}
 
 	if (hdr->nexthdr == NEXTHDR_HOP) {
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 021e5ae..8bb3bc1 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -235,7 +235,7 @@ int ip6_xmit(const struct sock *sk, struct sk_buff *skb, struct flowi6 *fl6,
 
 	skb_push(skb, sizeof(struct ipv6hdr));
 	skb_reset_network_header(skb);
-	hdr = ipv6_hdr(skb);
+	hdr = (struct ipv6hdr *)skb->data;
 
 	/*
 	 *	Fill in the IPv6 header
@@ -1659,7 +1659,7 @@ struct sk_buff *__ip6_make_skb(struct sock *sk,
 
 	skb_push(skb, sizeof(struct ipv6hdr));
 	skb_reset_network_header(skb);
-	hdr = ipv6_hdr(skb);
+	hdr = (struct ipv6hdr *)skb->data;
 
 	ip6_flow_hdr(hdr, v6_cork->tclass,
 		     ip6_make_flowlabel(net, skb, fl6->flowlabel,
-- 
1.8.3.1

* Re: [PATCH] netfilter: provide udp*_lib_lookup for nf_tproxy
From: Paolo Abeni @ 2018-06-05 12:13 UTC (permalink / raw)
  To: Arnd Bergmann, Pablo Neira Ayuso, David S. Miller,
	Alexey Kuznetsov, Hideaki YOSHIFUJI
  Cc: Máté Eckl, Willem de Bruijn, Eric Dumazet, David Ahern,
	Martin KaFai Lau, netdev, linux-kernel
In-Reply-To: <20180605114056.1239571-1-arnd@arndb.de>

On Tue, 2018-06-05 at 13:40 +0200, Arnd Bergmann wrote:
> It is now possible to enable the libified nf_tproxy modules without
> also enabling NETFILTER_XT_TARGET_TPROXY, which throws off the
> ifdef logic in the udp core code:
> 
> net/ipv6/netfilter/nf_tproxy_ipv6.o: In function `nf_tproxy_get_sock_v6':
> nf_tproxy_ipv6.c:(.text+0x1a8): undefined reference to `udp6_lib_lookup'
> net/ipv4/netfilter/nf_tproxy_ipv4.o: In function `nf_tproxy_get_sock_v4':
> nf_tproxy_ipv4.c:(.text+0x3d0): undefined reference to `udp4_lib_lookup'
> 
> We can actually simplify the conditions now to provide the two functions
> exactly when they are needed.
> 
> Fixes: 45ca4e0cf273 ("netfilter: Libify xt_TPROXY")
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  net/ipv4/udp.c | 4 +---
>  net/ipv6/udp.c | 4 +---
>  2 files changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 4f16e5d71875..3365362cac88 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -544,9 +544,7 @@ EXPORT_SYMBOL_GPL(udp4_lib_lookup_skb);
>  /* Must be called under rcu_read_lock().
>   * Does increment socket refcount.
>   */
> -#if IS_ENABLED(CONFIG_NETFILTER_XT_MATCH_SOCKET) || \
> -    IS_ENABLED(CONFIG_NETFILTER_XT_TARGET_TPROXY) || \
> -    IS_ENABLED(CONFIG_NF_SOCKET_IPV4)
> +#if IS_ENABLED(CONFIG_NF_TPROXY_IPV4) || IS_ENABLED(CONFIG_NF_SOCKET_IPV4)
>  struct sock *udp4_lib_lookup(struct net *net, __be32 saddr, __be16 sport,
>  			     __be32 daddr, __be16 dport, int dif)
>  {
> diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> index 967acff95bbe..164afd31aebf 100644
> --- a/net/ipv6/udp.c
> +++ b/net/ipv6/udp.c
> @@ -285,9 +285,7 @@ EXPORT_SYMBOL_GPL(udp6_lib_lookup_skb);
>  /* Must be called under rcu_read_lock().
>   * Does increment socket refcount.
>   */
> -#if IS_ENABLED(CONFIG_NETFILTER_XT_MATCH_SOCKET) || \
> -    IS_ENABLED(CONFIG_NETFILTER_XT_TARGET_TPROXY) || \
> -    IS_ENABLED(CONFIG_NF_SOCKET_IPV6)
> +#if IS_ENABLED(CONFIG_NF_TPROXY_IPV6) || IS_ENABLED(CONFIG_NF_SOCKET_IPV6)
>  struct sock *udp6_lib_lookup(struct net *net, const struct in6_addr *saddr, __be16 sport,
>  			     const struct in6_addr *daddr, __be16 dport, int dif)
>  {

LGTM,

Acked-by: Paolo Abeni <pabeni@redhat.com>

Thanks,

Paolo

* Re: [PATCH net-next 1/2] ipv4: replace ip_hdr() with skb->data for optimization
From: Paolo Abeni @ 2018-06-05 12:20 UTC (permalink / raw)
  To: Yafang Shao, edumazet, davem; +Cc: netdev, inux-kernel
In-Reply-To: <1528200262-11834-1-git-send-email-laoar.shao@gmail.com>

On Tue, 2018-06-05 at 08:04 -0400, Yafang Shao wrote:
> In the IP receive path, when the IP header hasn't been pulled yet,
> ip_hdr() and skb->data point to the same byte.
> 
> In the IP output path, when the IP header has just been pushed, ip_hdr()
> and skb->data point to the same byte.
> 
> As ip_hdr() is more expensive than using skb->data, replace ip_hdr()
> with skb->data in these situations as an optimization.

IMHO this makes the code less readable and more error-prone. What kind
of performance improvement do you measure here?

Thanks,

Paolo

* Re: KASAN: slab-out-of-bounds Read in bpf_csum_update
From: Daniel Borkmann @ 2018-06-05 12:28 UTC (permalink / raw)
  To: Dmitry Vyukov, syzbot
  Cc: Alexei Starovoitov, David Miller, LKML, netdev, syzkaller-bugs
In-Reply-To: <CACT4Y+a7hZ0x16PnyJxrb7akBRNr-TDR8Cvn01G27HmAKVE_vg@mail.gmail.com>

On 06/04/2018 07:36 AM, Dmitry Vyukov wrote:
> On Mon, Jun 4, 2018 at 1:36 AM, syzbot
> <syzbot+efae31b384d5badbd620@syzkaller.appspotmail.com> wrote:
>> Hello,
>>
>> syzbot found the following crash on:
>>
>> HEAD commit:    0512e0134582 Merge tag 'xfs-4.17-fixes-3' of git://git.ker..
>> git tree:       upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=17eb2d7b800000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=968b0b23c7854c0b
>> dashboard link: https://syzkaller.appspot.com/bug?extid=efae31b384d5badbd620
>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=162c6def800000
>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14fe3db7800000
>>
>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> Reported-by: syzbot+efae31b384d5badbd620@syzkaller.appspotmail.com
>>
>> random: sshd: uninitialized urandom read (32 bytes read)
>> random: sshd: uninitialized urandom read (32 bytes read)
>> random: sshd: uninitialized urandom read (32 bytes read)
>> random: sshd: uninitialized urandom read (32 bytes read)
>> ==================================================================
>> BUG: KASAN: slab-out-of-bounds in ____bpf_csum_update net/core/filter.c:1679
>> [inline]
>> BUG: KASAN: slab-out-of-bounds in bpf_csum_update+0xb4/0xc0
>> net/core/filter.c:1673
>> Read of size 1 at addr ffff8801d9235b50 by task syz-executor507/4513
>>
>> CPU: 0 PID: 4513 Comm: syz-executor507 Not tainted 4.17.0-rc7+ #78
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>> Google 01/01/2011
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:77 [inline]
>>  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
>>  print_address_description+0x6c/0x20b mm/kasan/report.c:256
>>  kasan_report_error mm/kasan/report.c:354 [inline]
>>  kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
>>  __asan_report_load1_noabort+0x14/0x20 mm/kasan/report.c:430
>>  ____bpf_csum_update net/core/filter.c:1679 [inline]
>>  bpf_csum_update+0xb4/0xc0 net/core/filter.c:1673
> 
> /\/\/\/\/\
> 
> Are there any known bugs with unwind through bpf functions?

Looks like you don't have kallsyms export enabled, here's a syzkaller diff
to get jit images exposed, then it should work:

diff --git a/tools/create-image.sh b/tools/create-image.sh
index 9f82482..395a2a0 100755
--- a/tools/create-image.sh
+++ b/tools/create-image.sh
@@ -23,6 +23,7 @@ echo 'SELINUX=disabled' | sudo tee $DIR/etc/selinux/config
 echo "kernel.printk = 7 4 1 3" | sudo tee -a $DIR/etc/sysctl.conf
 echo 'debug.exception-trace = 0' | sudo tee -a $DIR/etc/sysctl.conf
 echo "net.core.bpf_jit_enable = 1" | sudo tee -a $DIR/etc/sysctl.conf
+echo "net.core.bpf_jit_kallsyms = 1" | sudo tee -a $DIR/etc/sysctl.conf
 echo "kernel.softlockup_all_cpu_backtrace = 1" | sudo tee -a $DIR/etc/sysctl.conf
 echo "kernel.kptr_restrict = 0" | sudo tee -a $DIR/etc/sysctl.conf
 echo "kernel.watchdog_thresh = 60" | sudo tee -a $DIR/etc/sysctl.conf

Cheers,
Daniel

* [PATCH net-next 0/6] use pci_zalloc_consistent
From: YueHaibing @ 2018-06-05 12:28 UTC (permalink / raw)
  To: davem
  Cc: netdev, linux-kernel, jcliburn, chris.snook, benve, jdmason,
	chessman, jes, rahul.verma, YueHaibing


YueHaibing (6):
  net: hippi: use pci_zalloc_consistent
  net: atheros: use pci_zalloc_consistent
  net: neterion: use pci_zalloc_consistent
  netxen_nic: use pci_zalloc_consistent
  net: tlan: use pci_zalloc_consistent
  enic: use pci_zalloc_consistent

 drivers/net/ethernet/atheros/atl1e/atl1e_main.c    |  2 +-
 drivers/net/ethernet/atheros/atlx/atl1.c           |  8 +++----
 drivers/net/ethernet/atheros/atlx/atl2.c           |  5 ++--
 drivers/net/ethernet/cisco/enic/vnic_dev.c         |  3 +--
 drivers/net/ethernet/neterion/s2io.c               | 10 ++++----
 .../net/ethernet/qlogic/netxen/netxen_nic_ctx.c    | 26 ++++++++------------
 drivers/net/ethernet/ti/tlan.c                     |  7 +++---
 drivers/net/hippi/rrunner.c                        | 28 +++++++++-------------
 8 files changed, 35 insertions(+), 54 deletions(-)

-- 
2.7.0
