Netdev List
* Re: [PATCH v2 2/2] rtw88: pci: Use DMA sync instead of remapping in RX ISR
From: Christoph Hellwig @ 2019-07-09 16:15 UTC (permalink / raw)
  To: Jian-Hong Pan
  Cc: Yan-Hsuan Chuang, Kalle Valo, David S . Miller, Larry Finger,
	David Laight, linux-wireless, netdev, linux-kernel, linux,
	Daniel Drake
In-Reply-To: <20190709102059.7036-2-jian-hong@endlessm.com>

On Tue, Jul 09, 2019 at 06:21:01PM +0800, Jian-Hong Pan wrote:
> Since each skb in RX ring is reused instead of new allocation, we can
> treat the DMA in a more efficient way by DMA synchronization.
> 
> Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
> ---
>  drivers/net/wireless/realtek/rtw88/pci.c | 35 ++++++++++++++++++++++--
>  1 file changed, 32 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/wireless/realtek/rtw88/pci.c b/drivers/net/wireless/realtek/rtw88/pci.c
> index e9fe3ad896c8..28ca76f71dfe 100644
> --- a/drivers/net/wireless/realtek/rtw88/pci.c
> +++ b/drivers/net/wireless/realtek/rtw88/pci.c
> @@ -206,6 +206,35 @@ static int rtw_pci_reset_rx_desc(struct rtw_dev *rtwdev, struct sk_buff *skb,
>  	return 0;
>  }
>  
> +static int rtw_pci_sync_rx_desc_cpu(struct rtw_dev *rtwdev, dma_addr_t dma)
> +{
> +	struct device *dev = rtwdev->dev;
> +	int buf_sz = RTK_PCI_RX_BUF_SIZE;
> +
> +	dma_sync_single_for_cpu(dev, dma, buf_sz, PCI_DMA_FROMDEVICE);
> +
> +	return 0;
> +}

No need to return a value from this helper. In fact I'm not even sure
you need the helper at all.  Also please use the DMA_FROM_DEVICE
constant instead of the deprecated PCI variant.

> +static int rtw_pci_sync_rx_desc_device(struct rtw_dev *rtwdev, dma_addr_t dma,
> +				       struct rtw_pci_rx_ring *rx_ring,
> +				       u32 idx, u32 desc_sz)
> +{
> +	struct device *dev = rtwdev->dev;
> +	struct rtw_pci_rx_buffer_desc *buf_desc;
> +	int buf_sz = RTK_PCI_RX_BUF_SIZE;
> +
> +	dma_sync_single_for_device(dev, dma, buf_sz, PCI_DMA_FROMDEVICE);
> +
> +	buf_desc = (struct rtw_pci_rx_buffer_desc *)(rx_ring->r.head +
> +						     idx * desc_sz);
> +	memset(buf_desc, 0, sizeof(*buf_desc));
> +	buf_desc->buf_size = cpu_to_le16(RTK_PCI_RX_BUF_SIZE);
> +	buf_desc->dma = cpu_to_le32(dma);
> +
> +	return 0;
> +}

Same comment on the PCI constant and the return value here.
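
To make the two comments concrete, here is a rough sketch of what the CPU-side
helper could look like as a void function using DMA_FROM_DEVICE. This is not
the author's next revision. The struct definitions, the recording stub for
dma_sync_single_for_cpu() and the RX_BUF_SIZE_SKETCH value are userspace
stand-ins, assumed only so the fragment compiles outside the kernel tree:

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins; in the kernel these come from the DMA mapping headers. */
typedef uint64_t dma_addr_t;

enum dma_data_direction {
	DMA_BIDIRECTIONAL = 0,
	DMA_TO_DEVICE = 1,
	DMA_FROM_DEVICE = 2,
	DMA_NONE = 3,
};

/* Mock device that records the last sync request for inspection. */
struct device {
	dma_addr_t last_addr;
	size_t last_size;
	enum dma_data_direction last_dir;
	int cpu_syncs;
};

struct rtw_dev {
	struct device *dev;
};

/* Placeholder size, NOT the driver's real RTK_PCI_RX_BUF_SIZE. */
#define RX_BUF_SIZE_SKETCH 8192

/* Stub with the same parameter order as the kernel function of this name. */
static void dma_sync_single_for_cpu(struct device *dev, dma_addr_t addr,
				    size_t size, enum dma_data_direction dir)
{
	dev->last_addr = addr;
	dev->last_size = size;
	dev->last_dir = dir;
	dev->cpu_syncs++;
}

/* Nothing in here can fail, so the helper has no status to return. */
static void rtw_pci_sync_rx_desc_cpu(struct rtw_dev *rtwdev, dma_addr_t dma)
{
	dma_sync_single_for_cpu(rtwdev->dev, dma, RX_BUF_SIZE_SKETCH,
				DMA_FROM_DEVICE);
}
```

With the return value gone the helper is a one-line wrapper, which is why it
may be simpler to call dma_sync_single_for_cpu() directly at the call site.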

^ permalink raw reply

* Re: [PATCH] crypto: user - make NETLINK_CRYPTO work inside netns
From: Herbert Xu @ 2019-07-09 16:14 UTC (permalink / raw)
  To: Ondrej Mosnacek
  Cc: linux-crypto, netdev, David S . Miller, Stephan Mueller,
	Steffen Klassert, Don Zickus
In-Reply-To: <CAFqZXNs2XysEWVzmfXSczH-+oX5iwwRC3+9fL3tWYEfDRbqLig@mail.gmail.com>

On Tue, Jul 09, 2019 at 05:28:35PM +0200, Ondrej Mosnacek wrote:
>
> I admit I'm not an expert on Linux namespaces, but aren't you
> confusing network and user namespaces? Unless I'm mistaken, these
> changes only affect _network_ namespaces (which only isolate the
> network stuff itself) and the semantics of the netlink_capable(skb,
> CAP_NET_ADMIN) calls remain unchanged - they check if the opener of
> the socket has the CAP_NET_ADMIN capability within the global _user_
> namespace.

Good point.  I think your patch should be OK then.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH v3 bpf-next 0/4] selftests/bpf: fix compiling loop{1,2,3}.c on s390
From: Stanislav Fomichev @ 2019-07-09 16:07 UTC (permalink / raw)
  To: Ilya Leoshkevich; +Cc: bpf, netdev, ys114321, davem, ast, daniel
In-Reply-To: <20190709151809.37539-1-iii@linux.ibm.com>

On 07/09, Ilya Leoshkevich wrote:
> Use PT_REGS_RC(ctx) instead of ctx->rax, which is not present on s390.
> 
> This patch series consists of three preparatory commits, which make it
> possible to use PT_REGS_RC in BPF selftests, followed by the actual fix.
> 
> Since the last time, I've tested it with x86_64-linux-gnu-,
> aarch64-linux-gnu-, arm-linux-gnueabihf-, mips64el-linux-gnuabi64-,
> powerpc64le-linux-gnu-, s390x-linux-gnu- and sparc64-linux-gnu-
> compilers, and found that I also need to add arm64 support.
> 
> Like s390, arm64 exports user_pt_regs instead of struct pt_regs to
> userspace.
> 
> I've also made fixes for a few unrelated build problems, which I will
> post separately.
> 
> v1->v2: Split into multiple patches.
> v2->v3: Added arm64 support.
For the whole series:

Reviewed-by: Stanislav Fomichev <sdf@google.com>

This should probably go to bpf, not bpf-next since it fixes the
existing compilation problem.
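
For readers who have not looked at the macro, the idea behind PT_REGS_RC can
be sketched in plain C as below. The struct layouts are simplified mocks (only
the member that holds the return value is modelled) and the names
regs_sketch, PT_REGS_RC_SKETCH and probe_return_value are made up for this
illustration. The fallback branch simply assumes an x86_64-style layout:

```c
/*
 * Each architecture keeps the function return value in a different
 * register, so the accessor is selected at compile time.
 */
#if defined(__s390x__)
struct regs_sketch { unsigned long gprs[16]; };
#define PT_REGS_RC_SKETCH(ctx) ((ctx)->gprs[2])   /* r2 holds the return value */
#elif defined(__aarch64__)
struct regs_sketch { unsigned long long regs[31]; };
#define PT_REGS_RC_SKETCH(ctx) ((ctx)->regs[0])   /* x0 holds the return value */
#else
struct regs_sketch { unsigned long rax; };
#define PT_REGS_RC_SKETCH(ctx) ((ctx)->rax)       /* assumed x86_64-like layout */
#endif

/* Program logic stays identical on every architecture. */
static unsigned long long probe_return_value(const struct regs_sketch *ctx)
{
	return PT_REGS_RC_SKETCH(ctx);
}
```

Code that goes through the macro never names a register field directly, so
it builds unchanged on targets where ctx->rax does not exist.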

> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> 
> 

^ permalink raw reply

* [PATCH iproute2-next 3/3] man: update man pages for TC MPLS actions
From: John Hurley @ 2019-07-09 15:59 UTC (permalink / raw)
  To: netdev
  Cc: davem, jiri, xiyou.wangcong, dsahern, willemdebruijn.kernel,
	simon.horman, jakub.kicinski, oss-drivers, John Hurley
In-Reply-To: <1562687972-23549-1-git-send-email-john.hurley@netronome.com>

Add a man page describing the newly added TC mpls manipulation actions.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 man/man8/tc-mpls.8 | 156 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 156 insertions(+)
 create mode 100644 man/man8/tc-mpls.8

diff --git a/man/man8/tc-mpls.8 b/man/man8/tc-mpls.8
new file mode 100644
index 0000000..84ef2ef
--- /dev/null
+++ b/man/man8/tc-mpls.8
@@ -0,0 +1,156 @@
+.TH "MPLS manipulation action in tc" 8 "22 May 2019" "iproute2" "Linux"
+
+.SH NAME
+mpls - mpls manipulation module
+.SH SYNOPSIS
+.in +8
+.ti -8
+.BR tc " ... " "action mpls" " { "
+.IR POP " | " PUSH " | " MODIFY " | "
+.BR dec_ttl " } [ "
+.IR CONTROL " ]"
+
+.ti -8
+.IR POP " := "
+.BR pop " " protocol
+.IR MPLS_PROTO
+
+.ti -8
+.IR PUSH " := "
+.BR push " [ " protocol
+.IR MPLS_PROTO " ]"
+.RB " [ " tc
+.IR MPLS_TC " ] "
+.RB " [ " ttl
+.IR MPLS_TTL " ] "
+.RB " [ " bos
+.IR MPLS_BOS " ] "
+.BI label " MPLS_LABEL"
+
+.ti -8
+.IR MODIFY " := "
+.BR modify " [ " label
+.IR MPLS_LABEL " ]"
+.RB " [ " tc
+.IR MPLS_TC " ] "
+.RB " [ " ttl
+.IR MPLS_TTL " ] "
+
+.ti -8
+.IR CONTROL " := { "
+.BR reclassify " | " pipe " | " drop " | " continue " | " pass " | " goto " " chain " " CHAIN_INDEX " }"
+.SH DESCRIPTION
+The
+.B mpls
+action performs mpls encapsulation or decapsulation on a packet, reflected by the
+operation modes
+.IR POP ", " PUSH ", " MODIFY " and " DEC_TTL .
+The
+.I POP
+mode requires the ethertype of the header that follows the MPLS header (e.g.
+IPv4 or another MPLS). It will remove the outer MPLS header and replace the
+ethertype in the MAC header with that passed. The
+.IR PUSH " and " MODIFY
+modes update the current MPLS header information or add a new header.
+.IR PUSH
+requires at least an
+.IR MPLS_LABEL ". "
+.I DEC_TTL
+requires no arguments and simply subtracts 1 from the MPLS header TTL field.
+
+.SH OPTIONS
+.TP
+.B pop
+Decapsulation mode. Requires the protocol of the next header.
+.TP
+.B push
+Encapsulation mode. Requires at least the
+.B label
+option.
+.TP
+.B modify
+Replace mode. Existing MPLS tag is replaced.
+.BR label ", "
+.BR tc ", "
+and
+.B ttl
+are all optional.
+.TP
+.B dec_ttl
+Decrement the TTL field on the outermost MPLS header.
+.TP
+.BI label " MPLS_LABEL"
+Specify the MPLS LABEL for the outer MPLS header.
+.I MPLS_LABEL
+is an unsigned 20-bit integer, the format is detected automatically (e.g. prefix
+with
+.RB ' 0x '
+for hexadecimal interpretation, etc.).
+.TP
+.BI protocol " MPLS_PROTO"
+Choose the protocol to use. For push actions this must be
+.BR mpls_uc " or " mpls_mc " (" mpls_uc
+is the default). For pop actions it should be the protocol of the next header.
+This option cannot be used with modify.
+.TP
+.BI tc " MPLS_TC"
+Choose the TC value for the outer MPLS header. Decimal number in range of 0-7.
+Defaults to 0.
+.TP
+.BI ttl " MPLS_TTL"
+Choose the TTL value for the outer MPLS header. Number in range of 1-255. A
+non-zero default value will be selected if this is not explicitly set.
+.TP
+.BI bos " MPLS_BOS"
+Manually configure the bottom of stack bit for an MPLS header push. The default
+is for TC to automatically set (or unset) the bit based on the next header of
+the packet.
+.TP
+.I CONTROL
+How to continue after executing this action.
+.RS
+.TP
+.B reclassify
+Restarts classification by jumping back to the first filter attached to this
+action's parent.
+.TP
+.B pipe
+Continue with the next action. This is the default.
+.TP
+.B drop
+Packet will be dropped without running further actions.
+.TP
+.B continue
+Continue classification with next filter in line.
+.TP
+.B pass
+Return to calling qdisc for packet processing. This ends the classification
+process.
+.RE
+.SH EXAMPLES
+The following example encapsulates incoming IP packets on eth0 into MPLS with
+a label 123 and sends them out eth1:
+
+.RS
+.EX
+#tc qdisc add dev eth0 handle ffff: ingress
+#tc filter add dev eth0 protocol ip parent ffff: flower \\
+	action mpls push protocol mpls_uc label 123  \\
+	action mirred egress redirect dev eth1
+.EE
+.RE
+
+In this example, incoming MPLS unicast packets on eth0 are decapsulated to
+IP packets and output to eth1:
+
+.RS
+.EX
+#tc qdisc add dev eth0 handle ffff: ingress
+#tc filter add dev eth0 protocol mpls_uc parent ffff: flower \\
+	action mpls pop protocol ipv4  \\
+	action mirred egress redirect dev eth1
+.EE
+.RE
+
+.SH SEE ALSO
+.BR tc (8)
-- 
2.7.4


^ permalink raw reply related

* [PATCH iproute2-next 2/3] tc: add mpls actions
From: John Hurley @ 2019-07-09 15:59 UTC (permalink / raw)
  To: netdev
  Cc: davem, jiri, xiyou.wangcong, dsahern, willemdebruijn.kernel,
	simon.horman, jakub.kicinski, oss-drivers, John Hurley
In-Reply-To: <1562687972-23549-1-git-send-email-john.hurley@netronome.com>

Create a new action type for TC that allows the pushing, popping, and
modifying of MPLS headers.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 tc/Makefile |   1 +
 tc/m_mpls.c | 275 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 276 insertions(+)
 create mode 100644 tc/m_mpls.c

diff --git a/tc/Makefile b/tc/Makefile
index 60abdde..09ff369 100644
--- a/tc/Makefile
+++ b/tc/Makefile
@@ -39,6 +39,7 @@ TCMODULES += q_drr.o
 TCMODULES += q_qfq.o
 TCMODULES += m_gact.o
 TCMODULES += m_mirred.o
+TCMODULES += m_mpls.o
 TCMODULES += m_nat.o
 TCMODULES += m_pedit.o
 TCMODULES += m_ife.o
diff --git a/tc/m_mpls.c b/tc/m_mpls.c
new file mode 100644
index 0000000..d2700ec
--- /dev/null
+++ b/tc/m_mpls.c
@@ -0,0 +1,275 @@
+// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+/* Copyright (C) 2019 Netronome Systems, Inc. */
+
+#include <linux/if_ether.h>
+#include <linux/tc_act/tc_mpls.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+
+#include "utils.h"
+#include "rt_names.h"
+#include "tc_util.h"
+
+static const char * const action_names[] = {
+	[TCA_MPLS_ACT_POP] = "pop",
+	[TCA_MPLS_ACT_PUSH] = "push",
+	[TCA_MPLS_ACT_MODIFY] = "modify",
+	[TCA_MPLS_ACT_DEC_TTL] = "dec_ttl",
+};
+
+static void explain(void)
+{
+	fprintf(stderr,
+		"Usage: mpls pop [ protocol MPLS_PROTO ]\n"
+		"       mpls push [ protocol MPLS_PROTO ] [ label MPLS_LABEL ] [ tc MPLS_TC ] [ ttl MPLS_TTL ] [ bos MPLS_BOS ] [CONTROL]\n"
+		"       mpls modify [ label MPLS_LABEL ] [ tc MPLS_TC ] [ ttl MPLS_TTL ] [CONTROL]\n"
+		"	for pop MPLS_PROTO is next header of packet - e.g. ip or mpls_uc\n"
+		"       for push MPLS_PROTO is one of mpls_uc or mpls_mc\n"
+		"            with default: mpls_uc\n"
+		"       CONTROL := reclassify | pipe | drop | continue | pass |\n"
+		"                  goto chain <CHAIN_INDEX>\n");
+}
+
+static void usage(void)
+{
+	explain();
+	exit(-1);
+}
+
+static bool can_modify_mpls_fields(unsigned int action)
+{
+	return action == TCA_MPLS_ACT_PUSH || action == TCA_MPLS_ACT_MODIFY;
+}
+
+static bool can_modify_ethtype(unsigned int action)
+{
+	return action == TCA_MPLS_ACT_PUSH || action == TCA_MPLS_ACT_POP;
+}
+
+static bool is_valid_label(__u32 label)
+{
+	return label <= 0xfffff;
+}
+
+static bool check_double_action(unsigned int action, const char *arg)
+{
+	if (!action)
+		return false;
+
+	fprintf(stderr,
+		"Error: got \"%s\" but action already set to \"%s\"\n",
+		arg, action_names[action]);
+	explain();
+	return true;
+}
+
+static int parse_mpls(struct action_util *a, int *argc_p, char ***argv_p,
+		      int tca_id, struct nlmsghdr *n)
+{
+	struct tc_mpls parm = {};
+	__u32 label = 0xffffffff;
+	unsigned int action = 0;
+	char **argv = *argv_p;
+	struct rtattr *tail;
+	int argc = *argc_p;
+	__u16 proto = 0;
+	__u8 bos = 0xff;
+	__u8 tc = 0xff;
+	__u8 ttl = 0;
+
+	if (matches(*argv, "mpls") != 0)
+		return -1;
+
+	NEXT_ARG();
+
+	while (argc > 0) {
+		if (matches(*argv, "pop") == 0) {
+			if (check_double_action(action, *argv))
+				return -1;
+			action = TCA_MPLS_ACT_POP;
+		} else if (matches(*argv, "push") == 0) {
+			if (check_double_action(action, *argv))
+				return -1;
+			action = TCA_MPLS_ACT_PUSH;
+		} else if (matches(*argv, "modify") == 0) {
+			if (check_double_action(action, *argv))
+				return -1;
+			action = TCA_MPLS_ACT_MODIFY;
+		} else if (matches(*argv, "dec_ttl") == 0) {
+			if (check_double_action(action, *argv))
+				return -1;
+			action = TCA_MPLS_ACT_DEC_TTL;
+		} else if (matches(*argv, "label") == 0) {
+			if (!can_modify_mpls_fields(action))
+				invarg("only valid for push/modify", *argv);
+			NEXT_ARG();
+			if (get_u32(&label, *argv, 0) || !is_valid_label(label))
+				invarg("label must be <=0xFFFFF", *argv);
+		} else if (matches(*argv, "tc") == 0) {
+			if (!can_modify_mpls_fields(action))
+				invarg("only valid for push/modify", *argv);
+			NEXT_ARG();
+			if (get_u8(&tc, *argv, 0) || (tc & ~0x7))
+				invarg("tc field is 3 bits max", *argv);
+		} else if (matches(*argv, "ttl") == 0) {
+			if (!can_modify_mpls_fields(action))
+				invarg("only valid for push/modify", *argv);
+			NEXT_ARG();
+			if (get_u8(&ttl, *argv, 0) || !ttl)
+				invarg("ttl must be >0 and <=255", *argv);
+		} else if (matches(*argv, "bos") == 0) {
+			if (!can_modify_mpls_fields(action))
+				invarg("only valid for push/modify", *argv);
+			NEXT_ARG();
+			if (get_u8(&bos, *argv, 0) || (bos & ~0x1))
+				invarg("bos must be 0 or 1", *argv);
+		} else if (matches(*argv, "protocol") == 0) {
+			if (!can_modify_ethtype(action))
+				invarg("only valid for push/pop", *argv);
+			NEXT_ARG();
+			if (ll_proto_a2n(&proto, *argv))
+				invarg("protocol is invalid", *argv);
+		} else if (matches(*argv, "help") == 0) {
+			usage();
+		} else {
+			break;
+		}
+
+		NEXT_ARG_FWD();
+	}
+
+	if (!action)
+		incomplete_command();
+
+	parse_action_control_dflt(&argc, &argv, &parm.action,
+				  false, TC_ACT_PIPE);
+
+	if (argc) {
+		if (matches(*argv, "index") == 0) {
+			NEXT_ARG();
+			if (get_u32(&parm.index, *argv, 10))
+				invarg("illegal index", *argv);
+			NEXT_ARG_FWD();
+		}
+	}
+
+	if (action == TCA_MPLS_ACT_PUSH && label == 0xffffffff)
+		missarg("label");
+
+	if (action == TCA_MPLS_ACT_PUSH && proto &&
+	    proto != htons(ETH_P_MPLS_UC) && proto != htons(ETH_P_MPLS_MC)) {
+		fprintf(stderr,
+			"invalid push protocol \"0x%04x\" - use mpls_(uc|mc)\n",
+			ntohs(proto));
+		return -1;
+	}
+
+	if (action == TCA_MPLS_ACT_POP && !proto)
+		missarg("protocol");
+
+	parm.m_action = action;
+	tail = addattr_nest(n, MAX_MSG, tca_id | NLA_F_NESTED);
+	addattr_l(n, MAX_MSG, TCA_MPLS_PARMS, &parm, sizeof(parm));
+	if (label != 0xffffffff)
+		addattr_l(n, MAX_MSG, TCA_MPLS_LABEL, &label, sizeof(label));
+	if (proto)
+		addattr_l(n, MAX_MSG, TCA_MPLS_PROTO, &proto, sizeof(proto));
+	if (tc != 0xff)
+		addattr8(n, MAX_MSG, TCA_MPLS_TC, tc);
+	if (ttl)
+		addattr8(n, MAX_MSG, TCA_MPLS_TTL, ttl);
+	if (bos != 0xff)
+		addattr8(n, MAX_MSG, TCA_MPLS_BOS, bos);
+	addattr_nest_end(n, tail);
+
+	*argc_p = argc;
+	*argv_p = argv;
+	return 0;
+}
+
+static int print_mpls(struct action_util *au, FILE *f, struct rtattr *arg)
+{
+	struct rtattr *tb[TCA_MPLS_MAX + 1];
+	struct tc_mpls *parm;
+	SPRINT_BUF(b1);
+	__u32 val;
+
+	if (!arg)
+		return -1;
+
+	parse_rtattr_nested(tb, TCA_MPLS_MAX, arg);
+
+	if (!tb[TCA_MPLS_PARMS]) {
+		print_string(PRINT_FP, NULL, "%s", "[NULL mpls parameters]");
+		return -1;
+	}
+	parm = RTA_DATA(tb[TCA_MPLS_PARMS]);
+
+	print_string(PRINT_ANY, "kind", "%s ", "mpls");
+	print_string(PRINT_ANY, "mpls_action", " %s",
+		     action_names[parm->m_action]);
+
+	switch (parm->m_action) {
+	case TCA_MPLS_ACT_POP:
+		if (tb[TCA_MPLS_PROTO]) {
+			__u16 proto;
+
+			proto = rta_getattr_u16(tb[TCA_MPLS_PROTO]);
+			print_string(PRINT_ANY, "protocol", " protocol %s",
+				     ll_proto_n2a(proto, b1, sizeof(b1)));
+		}
+		break;
+	case TCA_MPLS_ACT_PUSH:
+		if (tb[TCA_MPLS_PROTO]) {
+			__u16 proto;
+
+			proto = rta_getattr_u16(tb[TCA_MPLS_PROTO]);
+			print_string(PRINT_ANY, "protocol", " protocol %s",
+				     ll_proto_n2a(proto, b1, sizeof(b1)));
+		}
+		/* Fallthrough */
+	case TCA_MPLS_ACT_MODIFY:
+		if (tb[TCA_MPLS_LABEL]) {
+			val = rta_getattr_u32(tb[TCA_MPLS_LABEL]);
+			print_uint(PRINT_ANY, "label", " label %u", val);
+		}
+		if (tb[TCA_MPLS_TC]) {
+			val = rta_getattr_u8(tb[TCA_MPLS_TC]);
+			print_uint(PRINT_ANY, "tc", " tc %u", val);
+		}
+		if (tb[TCA_MPLS_BOS]) {
+			val = rta_getattr_u8(tb[TCA_MPLS_BOS]);
+			print_uint(PRINT_ANY, "bos", " bos %u", val);
+		}
+		if (tb[TCA_MPLS_TTL]) {
+			val = rta_getattr_u8(tb[TCA_MPLS_TTL]);
+			print_uint(PRINT_ANY, "ttl", " ttl %u", val);
+		}
+		break;
+	}
+	print_action_control(f, " ", parm->action, "");
+
+	print_uint(PRINT_ANY, "index", "\n\t index %u", parm->index);
+	print_int(PRINT_ANY, "ref", " ref %d", parm->refcnt);
+	print_int(PRINT_ANY, "bind", " bind %d", parm->bindcnt);
+
+	if (show_stats) {
+		if (tb[TCA_MPLS_TM]) {
+			struct tcf_t *tm = RTA_DATA(tb[TCA_MPLS_TM]);
+
+			print_tm(f, tm);
+		}
+	}
+
+	print_string(PRINT_FP, NULL, "%s", "\n");
+
+	return 0;
+}
+
+struct action_util mpls_action_util = {
+	.id = "mpls",
+	.parse_aopt = parse_mpls,
+	.print_aopt = print_mpls,
+};
-- 
2.7.4


^ permalink raw reply related

* [PATCH iproute2-next 1/3] lib: add mpls_uc and mpls_mc as link layer protocol names
From: John Hurley @ 2019-07-09 15:59 UTC (permalink / raw)
  To: netdev
  Cc: davem, jiri, xiyou.wangcong, dsahern, willemdebruijn.kernel,
	simon.horman, jakub.kicinski, oss-drivers, John Hurley
In-Reply-To: <1562687972-23549-1-git-send-email-john.hurley@netronome.com>

Update the llproto_names array to allow users to reference the mpls
protocol ids with the names 'mpls_uc' for unicast MPLS and 'mpls_mc' for
multicast.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 lib/ll_proto.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/ll_proto.c b/lib/ll_proto.c
index 78c3961..2a0c1cb 100644
--- a/lib/ll_proto.c
+++ b/lib/ll_proto.c
@@ -78,6 +78,8 @@ __PF(TIPC,tipc)
 __PF(AOE,aoe)
 __PF(8021Q,802.1Q)
 __PF(8021AD,802.1ad)
+__PF(MPLS_UC,mpls_uc)
+__PF(MPLS_MC,mpls_mc)
 
 { 0x8100, "802.1Q" },
 { 0x88cc, "LLDP" },
-- 
2.7.4


^ permalink raw reply related

* [PATCH iproute2-next 0/3] add interface to TC MPLS actions
From: John Hurley @ 2019-07-09 15:59 UTC (permalink / raw)
  To: netdev
  Cc: davem, jiri, xiyou.wangcong, dsahern, willemdebruijn.kernel,
	simon.horman, jakub.kicinski, oss-drivers, John Hurley

Recent kernel additions to TC allow the manipulation of MPLS headers as
filter actions.

The following patchset creates an iproute2 interface to the new actions
and includes documentation on how to use it.
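
As background for the option ranges used in the series, the fields involved
all live in the 32-bit MPLS label stack entry (RFC 3032): a 20-bit label, a
3-bit traffic class, the bottom-of-stack bit and an 8-bit TTL. The sketch
below packs and unpacks such an entry in host byte order. The lse_* names are
made up for this illustration and are not taken from the patches:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions of the fields inside a label stack entry. */
#define LSE_LABEL_SHIFT 12
#define LSE_TC_SHIFT     9
#define LSE_BOS_SHIFT    8
#define LSE_LABEL_MAX    0xfffffu

/* Range checks matching the field widths: 20, 3 and 1 bits. */
static bool lse_fields_valid(uint32_t label, uint8_t tc, uint8_t bos)
{
	return label <= LSE_LABEL_MAX && tc <= 7 && bos <= 1;
}

/* Pack the fields in host order; the caller validates the ranges first. */
static uint32_t lse_encode(uint32_t label, uint8_t tc, uint8_t bos, uint8_t ttl)
{
	return (label << LSE_LABEL_SHIFT) |
	       ((uint32_t)tc << LSE_TC_SHIFT) |
	       ((uint32_t)bos << LSE_BOS_SHIFT) |
	       ttl;
}

/* Extract the 20-bit label. */
static uint32_t lse_label(uint32_t lse)
{
	return lse >> LSE_LABEL_SHIFT;
}

/* Extract the 8-bit TTL. */
static uint8_t lse_ttl(uint32_t lse)
{
	return (uint8_t)(lse & 0xff);
}
```

A value built this way would still need converting to network byte order
before it is compared with or written into a packet header.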

John Hurley (3):
  lib: add mpls_uc and mpls_mc as link layer protocol names
  tc: add mpls actions
  man: update man pages for TC MPLS actions

 lib/ll_proto.c     |   2 +
 man/man8/tc-mpls.8 | 156 ++++++++++++++++++++++++++++++
 tc/Makefile        |   1 +
 tc/m_mpls.c        | 275 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 434 insertions(+)
 create mode 100644 man/man8/tc-mpls.8
 create mode 100644 tc/m_mpls.c

-- 
2.7.4


^ permalink raw reply

* Re: [bpf PATCH v2 0/6] bpf: sockmap/tls fixes
From: John Fastabend @ 2019-07-09 15:40 UTC (permalink / raw)
  To: Jakub Kicinski, John Fastabend; +Cc: ast, daniel, netdev, edumazet, bpf
In-Reply-To: <20190708231318.1a721ce8@cakuba.netronome.com>

Jakub Kicinski wrote:
> On Mon, 08 Jul 2019 19:13:29 +0000, John Fastabend wrote:
> > Resolve a series of splats discovered by syzbot and an unhash
> > TLS issue noted by Eric Dumazet.
> > 
> > The main issues revolved around interaction between TLS and
> > sockmap tear down. TLS and sockmap could both reset sk->prot
> > ops creating a condition where a close or unhash op could be
> > called forever. A rare race condition resulting from a missing
> > rcu sync operation was causing a use after free. Then on the
> > TLS side dropping the sock lock and re-acquiring it during the
> > close op could hang. Finally, sockmap must be deployed before
> > tls for current stack assumptions to be met. This is enforced
> > now. A feature series can enable it.
> > 
> > To fix this first refactor TLS code so the lock is held for the
> > entire teardown operation. Then add an unhash callback to ensure
> > TLS can not transition from ESTABLISHED to LISTEN state. This
> > transition is a similar bug to the one found and fixed previously
> > in sockmap. Then apply three fixes to sockmap to fix up races
> > on tear down around map free and close. Finally, if sockmap
> > is destroyed before TLS we add a new ULP op update to inform
> > the TLS stack it should not call sockmap ops. This last one
> > appears to be the most commonly found issue from syzbot.
> 
> Looks like strparser is not done'd for offload?

Right so if rx_conf != TLS_SW then the hardware needs to do
the strparser functionality.

> 
> About patch 6 - I was recently wondering about the "impossible" syzbot
> report where context is not freed and my conclusion was that there
> can be someone sitting at lock_sock() in tcp_close() already by the
> time we start installing the ULP, so TLS's close will never get called.
> The entire replacing of callbacks business is really shaky :(

Well, replacing callbacks is the ULP model. The race we are fixing in
patch 6 is sockmap being freed, which removes the psock and resets proto
ops, racing with the tcp_close() path.
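
For anyone following along who is less familiar with that model, a toy
version of it is sketched below. This is not kernel code. Every name in it
(sock_sketch, proto_sketch, ulp_attach_sketch and so on) is made up, and it
ignores locking entirely. It only shows the save, replace and chain pattern:

```c
#include <stddef.h>

struct sock_sketch;

/* A one-entry ops table standing in for the per-protocol callbacks. */
struct proto_sketch {
	void (*close)(struct sock_sketch *sk);
};

struct sock_sketch {
	const struct proto_sketch *prot;       /* ops currently installed */
	const struct proto_sketch *saved_prot; /* ops in place before the upper layer */
	int ulp_closes;
	int base_closes;
};

static void base_close(struct sock_sketch *sk)
{
	sk->base_closes++;
}

static const struct proto_sketch base_prot = { .close = base_close };

static void ulp_close(struct sock_sketch *sk)
{
	const struct proto_sketch *saved = sk->saved_prot;

	sk->ulp_closes++;
	/* Restore the original ops first so the chained call cannot re-enter us. */
	sk->prot = saved;
	sk->saved_prot = NULL;
	saved->close(sk);
}

static const struct proto_sketch ulp_prot = { .close = ulp_close };

static void sock_init_sketch(struct sock_sketch *sk)
{
	sk->prot = &base_prot;
	sk->saved_prot = NULL;
	sk->ulp_closes = 0;
	sk->base_closes = 0;
}

/* Attach the upper layer: remember the old ops, then install ours. */
static void ulp_attach_sketch(struct sock_sketch *sk)
{
	sk->saved_prot = sk->prot;
	sk->prot = &ulp_prot;
}
```

The pattern only holds up while exactly one layer owns the saved pointer.
Once a second layer can reset the ops behind its back, the chain can end up
pointing at itself, which is the class of problem discussed in this thread.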

I don't think there is another race like you describe because tcp_set_ulp
is called from do_tcp_setsockopt which holds the lock and tcp state is
checked to ensure it's ESTABLISHED. A closing sock won't be in ESTABLISHED
state so any setup will be aborted. Before patch 1 though I definitely
saw this race because we dropped the lock mid-close.

With this series I've been running those syzbot programs overnight
without issue on 4 cores. Also selftests pass in ./net/tls and ./bpf/
so I think it's stable and resolves many of the issues syzbot has been
stomping around.

> 
> Perhaps I'm rambling, I will take a close look after I get some sleep :)

Yes please do ;)

^ permalink raw reply

* Re: [PATCH net-next iproute2 2/3] tc: Introduce tc ct action
From: Marcelo Ricardo Leitner @ 2019-07-09 15:36 UTC (permalink / raw)
  To: Paul Blakey
  Cc: Jiri Pirko, Roi Dayan, Yossi Kuperman, Oz Shlomo,
	netdev@vger.kernel.org, David Miller, Aaron Conole, Zhike Wang,
	Justin Pettit, John Hurley, Rony Efraim, nst-kernel@redhat.com,
	Simon Horman
In-Reply-To: <d4f2f3ce-f14d-6026-a271-d627de6d8cea@mellanox.com>

On Tue, Jul 09, 2019 at 06:58:36AM +0000, Paul Blakey wrote:
> 
> On 7/8/2019 8:54 PM, Marcelo Ricardo Leitner wrote:
> > On Sun, Jul 07, 2019 at 11:53:47AM +0300, Paul Blakey wrote:
> >> New tc action to send packets to conntrack module, commit
> >> them, and set a zone, labels, mark, and nat on the connection.
> >>
> >> It can also clear the packet's conntrack state by using clear.
> >>
> >> Usage:
> >>     ct clear
> >>     ct commit [force] [zone] [mark] [label] [nat]
> > Isn't the 'commit' also optional? More like
> >      ct [commit [force]] [zone] [mark] [label] [nat]
> >
> >>     ct [nat] [zone]
> >>
> >> Signed-off-by: Paul Blakey <paulb@mellanox.com>
> >> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> >> Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
> >> Acked-by: Jiri Pirko <jiri@mellanox.com>
> >> Acked-by: Roi Dayan <roid@mellanox.com>
> >> ---
> > ...
> >> +static void
> >> +usage(void)
> >> +{
> >> +	fprintf(stderr,
> >> +		"Usage: ct clear\n"
> >> +		"	ct commit [force] [zone ZONE] [mark MASKED_MARK] [label MASKED_LABEL] [nat NAT_SPEC]\n"
> > Ditto here then.
> 
> 
> In commit msg and here, it means there are multiple modes of operation. I 
> think it's easier to split those.

Yep, that is good.
More below.

> 
> "ct clear" to clear it; no other options can be added here.
> 
> "ct commit  [force].... " sends to conntrack and commit a connection, 
> and only for commit can you specify force mark  label, and nat with 
> nat_spec....
> 
> and the last one, "ct [nat] [zone ZONE]" is to just send the packet to 
> conntrack on some zone [optional], restore nat [optional].
> 
> 
> >
> >> +		"	ct [nat] [zone ZONE]\n"
> >> +		"Where: ZONE is the conntrack zone table number\n"
> >> +		"	NAT_SPEC is {src|dst} addr addr1[-addr2] [port port1[-port2]]\n"
> >> +		"\n");
> >> +	exit(-1);
> >> +}
> > ...
> >
> > The validation below doesn't enforce that commit must be there for
> > such case.
> which case? commit is optional. the above are the three valid patterns.

That's the point. But the 2nd example is saying the 'commit' word is
mandatory in that mode. It is written as if it were a command that was
selected.

One may use just:
    ct [zone]
And not
    ct commit [zone]
Right?


^ permalink raw reply

* Re: [PATCH net-next 15/16] net/mlx5e: RX, Handle CQE with error at the earliest stage
From: Jiri Pirko @ 2019-07-09 15:34 UTC (permalink / raw)
  To: Tariq Toukan
  Cc: David S. Miller, netdev, Eran Ben Elisha, ayal, jiri,
	Saeed Mahameed, moshe
In-Reply-To: <1562500388-16847-16-git-send-email-tariqt@mellanox.com>

Sun, Jul 07, 2019 at 01:53:07PM CEST, tariqt@mellanox.com wrote:
>From: Saeed Mahameed <saeedm@mellanox.com>
>
>Just to be aligned with the MPWQE handlers, handle RX WQE with error
>for legacy RQs in the top RX handlers, just before calling skb_from_cqe().
>
>CQE error handling will now be called at the same stage regardless of
>the RQ type or netdev mode (NIC, Representor, IPoIB, etc.).
>
>This will be useful for downstream patches to improve error CQE

I see only one patch left in this set.

^ permalink raw reply

* Re: [PATCH bpf-next RFC v3 2/6] bpf: add BPF_MAP_DUMP command to dump more than one entry per call
From: Brian Vazquez @ 2019-07-09 15:34 UTC (permalink / raw)
  To: Y Song
  Cc: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann,
	David S . Miller, Stanislav Fomichev, Willem de Bruijn,
	Petar Penkov, LKML, netdev, bpf
In-Reply-To: <CAH3MdRU505Er44m460c7y5nxtZxmDmVY4jDrWOYt2=OdP2d5Ow@mail.gmail.com>

> Maybe you can swap map_fd and flags?
> This way, you won't have hole right after map_fd?

Makes sense.
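
To spell out why the order matters, here is a small sketch with two made-up
layouts. They are not the real bpf_attr members from the RFC, only an
illustration with one 32-bit fd and 64-bit neighbours. _Alignas(8) stands in
for the kernel's __aligned_u64 so the result does not depend on the ABI:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with the 32-bit fd in the middle. */
struct dump_attr_before {
	_Alignas(8) uint64_t buf;
	uint32_t map_fd;
	_Alignas(8) uint64_t flags;
};

/* Same members with map_fd and flags swapped. */
struct dump_attr_after {
	_Alignas(8) uint64_t buf;
	_Alignas(8) uint64_t flags;
	uint32_t map_fd;
};

/* Padding bytes between the end of map_fd and the start of flags. */
static size_t hole_before(void)
{
	return offsetof(struct dump_attr_before, flags) -
	       (offsetof(struct dump_attr_before, map_fd) + sizeof(uint32_t));
}

/* Padding bytes between the end of flags and the start of map_fd. */
static size_t hole_after(void)
{
	return offsetof(struct dump_attr_after, map_fd) -
	       (offsetof(struct dump_attr_after, flags) + sizeof(uint64_t));
}
```

In this sketch the first layout has a 4-byte hole in the middle and the
second has none. The overall size stays the same because the padding moves to
the tail, where a later field can reuse it.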

> > +       attr->flags = 0;
> Why do you want attr->flags? This is to modify anonymous struct used by
> BPF_MAP_*_ELEM commands.

Nice catch! This was a mistake; I forgot to delete that line.

> In bcc, we have use cases like this. At a certain time interval (e.g.,
> every 2 seconds),
> we get all key/value pairs for a map, we format and print out map
> key/values on the screen,
> and then delete all key/value pairs we retrieved earlier.
>
> Currently, bpf_get_next_key() is used to get all key/value pairs, and
> deletion also happened
> at each key level.
>
> Your batch dump command should help retrieving map key/value pairs.
> What do you think about deleting those just-retrieved map entries?
> With an additional flag and fold into BPF_MAP_DUMP?
> or implement a new BPF_MAP_DUMP_AND_DELETE?
>
> I mentioned this so that we can start discussion now.
> You do not need to implement batch deletion part, but let us
> have a design extensible for that.
>
> Thanks.

With an additional flag, the code could be racy: you could copy an old value
and delete the newest one.
So maybe we could implement BPF_MAP_DUMP_AND_DELETE as a wrapper of
map_get_next_key + map_lookup_and_delete_elem. Last function already
exists but it has not been implemented for maps other than stack and
queue.
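
Roughly, the wrapper idea would look like the sketch below. It uses a toy
array-backed map in userspace, with made-up toy_* names, and is single
threaded, so it only shows the control flow and not the locking a real
implementation would need:

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAP_SLOTS 8

/* Toy map: the key is the slot index, used[] marks populated slots. */
struct toy_map {
	bool used[TOY_MAP_SLOTS];
	int value[TOY_MAP_SLOTS];
};

/* Find the first populated key after *prev, or the first one if prev is NULL. */
static bool toy_get_next_key(const struct toy_map *m, const int *prev, int *next)
{
	int start = prev ? *prev + 1 : 0;

	if (start < 0)
		start = 0;
	for (int k = start; k < TOY_MAP_SLOTS; k++) {
		if (m->used[k]) {
			*next = k;
			return true;
		}
	}
	return false;
}

/* Read and remove in one step, so the value returned is the one deleted. */
static bool toy_lookup_and_delete(struct toy_map *m, int key, int *value)
{
	if (key < 0 || key >= TOY_MAP_SLOTS || !m->used[key])
		return false;
	*value = m->value[key];
	m->used[key] = false;
	return true;
}

/* Copy out up to cap entries, deleting each one as it is copied. */
static size_t toy_dump_and_delete(struct toy_map *m, int *keys, int *values,
				  size_t cap)
{
	size_t n = 0;
	int key = 0, next = 0;
	bool have = toy_get_next_key(m, NULL, &key);

	while (have && n < cap) {
		/* Look up the successor before the current key disappears. */
		have = toy_get_next_key(m, &key, &next);
		if (toy_lookup_and_delete(m, key, &values[n])) {
			keys[n] = key;
			n++;
		}
		key = next;
	}
	return n;
}
```

Because the read and the delete of each entry happen in one operation, the
value handed back is always the one that was removed, which is the property
the separate lookup-then-delete sequence cannot guarantee.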

Thanks for reviewing it!

^ permalink raw reply

* Re: [PATCH net-next 14/16] net/mlx5e: Recover from rx timeout
From: Jiri Pirko @ 2019-07-09 15:32 UTC (permalink / raw)
  To: Tariq Toukan
  Cc: David S. Miller, netdev, Eran Ben Elisha, ayal, jiri,
	Saeed Mahameed, moshe
In-Reply-To: <1562500388-16847-15-git-send-email-tariqt@mellanox.com>

Sun, Jul 07, 2019 at 01:53:06PM CEST, tariqt@mellanox.com wrote:
>From: Aya Levin <ayal@mellanox.com>
>
>Add support for recovery from rx timeout. On driver open we post NOP
>work request on the rx channels to trigger napi in order to fill up the
>rx rings. In case napi wasn't scheduled due to a lost interrupt, perform
>EQ recovery.
>
>Signed-off-by: Aya Levin <ayal@mellanox.com>
>Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
>---
> .../net/ethernet/mellanox/mlx5/core/en/health.h    |  1 +
> .../ethernet/mellanox/mlx5/core/en/reporter_rx.c   | 30 ++++++++++++++++++++++
> drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |  1 +
> 3 files changed, 32 insertions(+)
>
>diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/health.h b/drivers/net/ethernet/mellanox/mlx5/core/en/health.h
>index e8c5d3bd86f1..aa46f7ecae53 100644
>--- a/drivers/net/ethernet/mellanox/mlx5/core/en/health.h
>+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/health.h
>@@ -19,6 +19,7 @@
> int mlx5e_reporter_rx_create(struct mlx5e_priv *priv);
> void mlx5e_reporter_rx_destroy(struct mlx5e_priv *priv);
> void mlx5e_reporter_icosq_cqe_err(struct mlx5e_icosq *icosq);
>+void mlx5e_reporter_rx_timeout(struct mlx5e_rq *rq);
> 
> #define MLX5E_REPORTER_PER_Q_MAX_LEN 256
> 
>diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c
>index c47e9a53bd53..7e7dba129330 100644
>--- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c
>+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c
>@@ -109,6 +109,36 @@ void mlx5e_reporter_icosq_cqe_err(struct mlx5e_icosq *icosq)
> 	mlx5e_health_report(priv, priv->rx_reporter, err_str, &err_ctx);
> }
> 
>+static int mlx5e_rx_reporter_timeout_recover(void *ctx)
>+{
>+	struct mlx5e_rq *rq = (struct mlx5e_rq *)ctx;

No need to cast. Please fix this in the rest of the patchset too.
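
In C a void pointer converts implicitly to any object pointer type, so the
assignment alone is enough. A minimal standalone sketch, where rq_sketch and
recover_sketch are made-up names and not the driver's types:

```c
struct rq_sketch {
	int rqn;
};

static int recover_sketch(void *ctx)
{
	/* Implicit conversion from void *, no cast required in C. */
	struct rq_sketch *rq = ctx;

	return rq->rqn;
}
```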


>+	struct mlx5e_icosq *icosq = &rq->channel->icosq;
>+	struct mlx5_eq_comp *eq = rq->cq.mcq.eq;
>+	int err;
>+
>+	err = mlx5e_health_channel_eq_recover(eq, rq->channel);
>+	if (err)
>+		clear_bit(MLX5E_SQ_STATE_ENABLED, &icosq->state);
>+
>+	return err;
>+}
>+
>+void mlx5e_reporter_rx_timeout(struct mlx5e_rq *rq)
>+{
>+	struct mlx5e_icosq *icosq = &rq->channel->icosq;
>+	struct mlx5e_priv *priv = rq->channel->priv;
>+	char err_str[MLX5E_REPORTER_PER_Q_MAX_LEN];
>+	struct mlx5e_err_ctx err_ctx = {};
>+
>+	err_ctx.ctx = rq;
>+	err_ctx.recover = mlx5e_rx_reporter_timeout_recover;
>+	sprintf(err_str,
>+		"RX timeout on channel: %d, ICOSQ: 0x%x RQ: 0x%x, CQ: 0x%x\n",
>+		icosq->channel->ix, icosq->sqn, rq->rqn, rq->cq.mcq.cqn);
>+
>+	mlx5e_health_report(priv, priv->rx_reporter, err_str, &err_ctx);
>+}
>+
> static int mlx5e_rx_reporter_recover_from_ctx(struct mlx5e_err_ctx *err_ctx)
> {
> 	return err_ctx->recover(err_ctx->ctx);
>diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>index 2d57611ac579..1ebdeccf395d 100644
>--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>@@ -809,6 +809,7 @@ int mlx5e_wait_for_min_rx_wqes(struct mlx5e_rq *rq, int wait_time)
> 	netdev_warn(c->netdev, "Failed to get min RX wqes on Channel[%d] RQN[0x%x] wq cur_sz(%d) min_rx_wqes(%d)\n",
> 		    c->ix, rq->rqn, mlx5e_rqwq_get_cur_sz(rq), min_wqes);
> 
>+	mlx5e_reporter_rx_timeout(rq);
> 	return -ETIMEDOUT;
> }
> 
>-- 
>1.8.3.1
>
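
The "No need to cast" remark above relies on C converting void * to any
object pointer type implicitly. A minimal userspace sketch follows; the
demo_rq type and demo_recover() are hypothetical stand-ins, not the mlx5e
structures.

```c
/* Hypothetical stand-in for a driver queue structure. */
struct demo_rq {
	int rqn;
};

/* Callback taking an opaque context, as recover handlers usually do. */
static int demo_recover(void *ctx)
{
	/* In C, void * converts implicitly; no (struct demo_rq *) cast needed. */
	struct demo_rq *rq = ctx;

	return rq->rqn;
}
```

Passing &rq for a struct demo_rq works the same way in the other
direction: the pointer converts to void * without a cast.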

^ permalink raw reply

* Re: [PATCH net-next 13/16] net/mlx5e: Recover from CQE error on ICOSQ
From: Jiri Pirko @ 2019-07-09 15:30 UTC (permalink / raw)
  To: Tariq Toukan
  Cc: David S. Miller, netdev, Eran Ben Elisha, ayal, jiri,
	Saeed Mahameed, moshe
In-Reply-To: <1562500388-16847-14-git-send-email-tariqt@mellanox.com>

Sun, Jul 07, 2019 at 01:53:05PM CEST, tariqt@mellanox.com wrote:
>From: Aya Levin <ayal@mellanox.com>
>
>Add support for recovery from error on completion on ICOSQ. Deactivate
>RQ and flush, then deactivate ICOSQ. Set the queue back to ready state
>(firmware) and reset the ICOSQ and the RQ (software resources). Finally,
>activate the ICOSQ and the RQ.
>
>Signed-off-by: Aya Levin <ayal@mellanox.com>
>Signed-off-by: Tariq Toukan <tariqt@mellanox.com>

Acked-by: Jiri Pirko <jiri@mellanox.com>

^ permalink raw reply

* Re: [PATCH] crypto: user - make NETLINK_CRYPTO work inside netns
From: Ondrej Mosnacek @ 2019-07-09 15:28 UTC (permalink / raw)
  To: Herbert Xu
  Cc: linux-crypto, netdev, David S . Miller, Stephan Mueller,
	Steffen Klassert, Don Zickus
In-Reply-To: <20190709143832.hej23rahmb4basy6@gondor.apana.org.au>

On Tue, Jul 9, 2019 at 4:38 PM Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Tue, Jul 09, 2019 at 01:11:24PM +0200, Ondrej Mosnacek wrote:
> > Currently, NETLINK_CRYPTO works only in the init network namespace. It
> > doesn't make much sense to cut it out of the other network namespaces,
> > so do the minor plumbing work necessary to make it work in any network
> > namespace. Code inspired by net/core/sock_diag.c.
> >
> > Tested using kcapi-dgst from libkcapi [1]:
> > Before:
> >     # unshare -n kcapi-dgst -c sha256 </dev/null | wc -c
> >     libkcapi - Error: Netlink error: sendmsg failed
> >     libkcapi - Error: Netlink error: sendmsg failed
> >     libkcapi - Error: NETLINK_CRYPTO: cannot obtain cipher information for hmac(sha512) (is required crypto_user.c patch missing? see documentation)
> >     0
> >
> > After:
> >     # unshare -n kcapi-dgst -c sha256 </dev/null | wc -c
> >     32
> >
> > [1] https://github.com/smuellerDD/libkcapi
> >
> > Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
>
> Should we really let root inside a namespace manipulate crypto
> algorithms which are global?

I admit I'm not an expert on Linux namespaces, but aren't you
confusing network and user namespaces? Unless I'm mistaken, these
changes only affect _network_ namespaces (which only isolate the
network stuff itself) and the semantics of the netlink_capable(skb,
CAP_NET_ADMIN) calls remain unchanged - they check if the opener of
the socket has the CAP_NET_ADMIN capability within the global _user_
namespace.

>
> I think we should only allow the query operations without deeper
> surgery.
>
> Cheers,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

-- 
Ondrej Mosnacek <omosnace at redhat dot com>
Software Engineer, Security Technologies
Red Hat, Inc.

^ permalink raw reply

* Re: [PATCH net-next v2 8/8] net: mscc: PTP Hardware Clock (PHC) support
From: Antoine Tenart @ 2019-07-09 15:23 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Antoine Tenart, davem, richardcochran, alexandre.belloni,
	UNGLinuxDriver, ralf, paul.burton, jhogan, netdev, linux-mips,
	thomas.petazzoni, allan.nielsen
In-Reply-To: <20190708120626.2cecc86b@cakuba.netronome.com>

Hello Jakub,

On Mon, Jul 08, 2019 at 12:06:26PM -0700, Jakub Kicinski wrote:
> On Mon, 8 Jul 2019 10:48:09 +0200, Antoine Tenart wrote:
> > > > +	/* Commit back the result & save it */
> > > > +	memcpy(&ocelot->hwtstamp_config, &cfg, sizeof(cfg));
> > > > +	mutex_unlock(&ocelot->ptp_lock);
> > > > +
> > > > +	return copy_to_user(ifr->ifr_data, &cfg, sizeof(cfg)) ? -EFAULT : 0;
> > > > +}
> > > >  
> > > > +static int ocelot_get_ts_info(struct net_device *dev,
> > > > +			      struct ethtool_ts_info *info)
> > > > +{
> > > > +	struct ocelot_port *ocelot_port = netdev_priv(dev);
> > > > +	struct ocelot *ocelot = ocelot_port->ocelot;
> > > > +	int ret;
> > > > +
> > > > +	if (!ocelot->ptp)
> > > > +		return -EOPNOTSUPP;  
> > > 
> > > Hmm.. why does software timestamping depend on PTP?  
> > 
> > Because it depends on the "PTP" register bank (and the "PTP" interrupt)
> > being described and available. This is why I named the flag 'ptp', but
> > it could be named 'timestamp' or 'ts' as well.
> 
> Right, but software timestamps are done by calling skb_tx_timestamp(skb)
> in the driver, no need for HW support there (software RX timestamp is
> handled by the stack).

I see, I should instead filter the flags based on this so that the s/w
ones still get set.

Thanks!
Antoine

-- 
Antoine Ténart, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
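
One way to read the "filter the flags" idea above, as a userspace sketch
only: always report the software timestamping capabilities, and add the
hardware ones when the PTP block is available. The SOF_TIMESTAMPING_*
constants come from the Linux UAPI header <linux/net_tstamp.h>;
demo_so_timestamping() and its has_ptp parameter are hypothetical and are
not the ocelot driver code.

```c
#include <stdbool.h>
#include <linux/net_tstamp.h>

/* Build a so_timestamping capability mask (hypothetical helper). */
static unsigned int demo_so_timestamping(bool has_ptp)
{
	/* Software timestamps need no hardware support. */
	unsigned int flags = SOF_TIMESTAMPING_TX_SOFTWARE |
			     SOF_TIMESTAMPING_RX_SOFTWARE |
			     SOF_TIMESTAMPING_SOFTWARE;

	/* Hardware timestamps are advertised only with a PTP block. */
	if (has_ptp)
		flags |= SOF_TIMESTAMPING_TX_HARDWARE |
			 SOF_TIMESTAMPING_RX_HARDWARE |
			 SOF_TIMESTAMPING_RAW_HARDWARE;

	return flags;
}
```

With this shape the function never has to fail with -EOPNOTSUPP just
because the hardware clock is missing.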

^ permalink raw reply

* [PATCH bpf] selftests/bpf: fix bpf_target_sparc check
From: Ilya Leoshkevich @ 2019-07-09 15:21 UTC (permalink / raw)
  To: bpf, netdev; +Cc: Ilya Leoshkevich

bpf_helpers.h fails to compile on sparc: the code should be checking
for defined(bpf_target_sparc), but checks simply for bpf_target_sparc.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/bpf_helpers.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/bpf_helpers.h b/tools/testing/selftests/bpf/bpf_helpers.h
index 5f6f9e7aba2a..a8fea087aa90 100644
--- a/tools/testing/selftests/bpf/bpf_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_helpers.h
@@ -443,7 +443,7 @@ static int (*bpf_skb_adjust_room)(void *ctx, __s32 len_diff, __u32 mode,
 #ifdef bpf_target_powerpc
 #define BPF_KPROBE_READ_RET_IP(ip, ctx)		({ (ip) = (ctx)->link; })
 #define BPF_KRETPROBE_READ_RET_IP		BPF_KPROBE_READ_RET_IP
-#elif bpf_target_sparc
+#elif defined(bpf_target_sparc)
 #define BPF_KPROBE_READ_RET_IP(ip, ctx)		({ (ip) = PT_REGS_RET(ctx); })
 #define BPF_KRETPROBE_READ_RET_IP		BPF_KPROBE_READ_RET_IP
 #else
-- 
2.21.0
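
Why the bare check breaks only on sparc, as a small sketch: the target
markers appear to be defined with no value, so "#elif bpf_target_sparc"
expands to an "#elif" with no expression once the macro is defined, while
an undefined identifier would simply evaluate to 0. The demo_* names
below are hypothetical and only mirror the pattern.

```c
/* Empty marker macro, mirroring how the bpf_target_* macros are defined. */
#define demo_target_sparc

#if defined(demo_target_powerpc)
#define DEMO_ARCH "powerpc"
#elif defined(demo_target_sparc)	/* "#elif demo_target_sparc" would not compile */
#define DEMO_ARCH "sparc"
#else
#define DEMO_ARCH "other"
#endif

/* Return the branch the preprocessor selected. */
static const char *demo_arch(void)
{
	return DEMO_ARCH;
}
```

defined() tests only for existence, so it works whether or not the macro
carries a value.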


^ permalink raw reply related

* WARNING: refcount bug in nr_insert_socket
From: syzbot @ 2019-07-09 15:21 UTC (permalink / raw)
  To: davem, linux-hams, linux-kernel, netdev, ralf, syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    4608a726 Add linux-next specific files for 20190709
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1387b608600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=7a02e36d356a9a17
dashboard link: https://syzkaller.appspot.com/bug?extid=ec1fd464d849d91c3665
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16b47be8600000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15172e7ba00000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+ec1fd464d849d91c3665@syzkaller.appspotmail.com

------------[ cut here ]------------
refcount_t: increment on 0; use-after-free.
WARNING: CPU: 0 PID: 14391 at lib/refcount.c:156 refcount_inc_checked lib/refcount.c:156 [inline]
WARNING: CPU: 0 PID: 14391 at lib/refcount.c:156 refcount_inc_checked+0x61/0x70 lib/refcount.c:154
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 14391 Comm: syz-executor638 Not tainted 5.2.0-next-20190709 #34
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
  <IRQ>
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x172/0x1f0 lib/dump_stack.c:113
  panic+0x2dc/0x755 kernel/panic.c:219
  __warn.cold+0x20/0x4c kernel/panic.c:576
  report_bug+0x263/0x2b0 lib/bug.c:186
  fixup_bug arch/x86/kernel/traps.c:179 [inline]
  fixup_bug arch/x86/kernel/traps.c:174 [inline]
  do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272
  do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291
  invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:1008
RIP: 0010:refcount_inc_checked lib/refcount.c:156 [inline]
RIP: 0010:refcount_inc_checked+0x61/0x70 lib/refcount.c:154
Code: 1d 83 26 64 06 31 ff 89 de e8 5b 44 35 fe 84 db 75 dd e8 12 43 35 fe  
48 c7 c7 60 04 c6 87 c6 05 63 26 64 06 01 e8 77 ab 06 fe <0f> 0b eb c1 90  
90 90 90 90 90 90 90 90 90 90 55 48 89 e5 41 57 41
RSP: 0018:ffff8880ae809bf0 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000100 RSI: ffffffff815bfa86 RDI: ffffed1015d01370
RBP: ffff8880ae809c00 R08: ffff8880988924c0 R09: fffffbfff14a7757
R10: fffffbfff14a7756 R11: ffffffff8a53bab7 R12: ffff888097414c80
R13: ffff888097414c68 R14: ffff888096051348 R15: ffff888096051320
  sock_hold include/net/sock.h:649 [inline]
  sk_add_node include/net/sock.h:701 [inline]
  nr_insert_socket+0x2d/0xe0 net/netrom/af_netrom.c:137
  nr_rx_frame+0x1605/0x1e73 net/netrom/af_netrom.c:1023
  nr_loopback_timer+0x7b/0x170 net/netrom/nr_loopback.c:59
  call_timer_fn+0x1ac/0x780 kernel/time/timer.c:1322
  expire_timers kernel/time/timer.c:1366 [inline]
  __run_timers kernel/time/timer.c:1685 [inline]
  __run_timers kernel/time/timer.c:1653 [inline]
  run_timer_softirq+0x697/0x17a0 kernel/time/timer.c:1698
  __do_softirq+0x262/0x98c kernel/softirq.c:292
  invoke_softirq kernel/softirq.c:373 [inline]
  irq_exit+0x19b/0x1e0 kernel/softirq.c:413
  exiting_irq arch/x86/include/asm/apic.h:537 [inline]
  smp_apic_timer_interrupt+0x1a3/0x610 arch/x86/kernel/apic/apic.c:1095
  apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:828
  </IRQ>
RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:767 [inline]
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0x95/0xe0 kernel/locking/spinlock.c:191
Code: 48 c7 c0 d0 e3 d2 88 48 ba 00 00 00 00 00 fc ff df 48 c1 e8 03 80 3c  
10 00 75 39 48 83 3d d2 3e 99 01 00 74 24 48 89 df 57 9d <0f> 1f 44 00 00  
bf 01 00 00 00 e8 fc c8 14 fa 65 8b 05 6d 58 c8 78
RSP: 0018:ffff88808720fd10 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13
RAX: 1ffffffff11a5c7a RBX: 0000000000000286 RCX: 0000000000000000
RDX: dffffc0000000000 RSI: 0000000000000006 RDI: 0000000000000286
RBP: ffff88808720fd20 R08: ffff8880988924c0 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff8aa79aa8
R13: ffffffff8aa79aa0 R14: ffff88809683add0 R15: ffff88808720fdc0
  debug_object_free lib/debugobjects.c:823 [inline]
  debug_object_free+0x1f1/0x390 lib/debugobjects.c:796
  destroy_hrtimer_on_stack kernel/time/hrtimer.c:432 [inline]
  hrtimer_nanosleep+0x2d8/0x570 kernel/time/hrtimer.c:1748
  __do_sys_nanosleep kernel/time/hrtimer.c:1767 [inline]
  __se_sys_nanosleep kernel/time/hrtimer.c:1754 [inline]
  __x64_sys_nanosleep+0x1a6/0x220 kernel/time/hrtimer.c:1754
  do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x447811
Code: 75 14 b8 23 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 b4 1e fc ff c3 48  
83 ec 08 e8 6a 44 00 00 48 89 04 24 b8 23 00 00 00 0f 05 <48> 8b 3c 24 48  
89 c2 e8 b3 44 00 00 48 89 d0 48 83 c4 08 48 3d 01
RSP: 002b:00007ffcca488140 EFLAGS: 00000293 ORIG_RAX: 0000000000000023
RAX: ffffffffffffffda RBX: 0000000000000048 RCX: 0000000000447811
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007ffcca488150
RBP: 00000000006dfc6c R08: 00000000004b1a31 R09: 00000000004b1a31
R10: 00007ffcca488180 R11: 0000000000000293 R12: 00000000006dfc60
R13: 0000000000000002 R14: 000000000000002d R15: 20c49ba5e353f7cf
Kernel Offset: disabled
Rebooting in 86400 seconds..


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches
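
The "increment on 0" warning above means a reference was taken on an
object whose count had already dropped to zero. A userspace sketch of the
usual "get unless zero" pattern with C11 atomics follows; demo_ref and
demo_ref_get_unless_zero() are hypothetical, and this is an illustration
of the pattern, not a fix for the netrom code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference-counted object. */
struct demo_ref {
	atomic_uint count;
};

/* Take a reference only if the object is still alive (count != 0). */
static bool demo_ref_get_unless_zero(struct demo_ref *r)
{
	unsigned int old = atomic_load(&r->count);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}

	return false;	/* count was 0: the object is being freed */
}
```

A caller that gets false back must treat the object as gone instead of
inserting it into a list.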

^ permalink raw reply

* Re: [PATCH net-next] bnxt_en: Add page_pool_destroy() during RX ring cleanup.
From: Ilias Apalodimas @ 2019-07-09 15:20 UTC (permalink / raw)
  To: Andy Gospodarek; +Cc: Michael Chan, davem, netdev
In-Reply-To: <20190709131842.GJ87269@C02RW35GFVH8.dhcp.broadcom.net>

Hi,

> > Add page_pool_destroy() in bnxt_free_rx_rings() during normal RX ring
> > cleanup, as Ilias has informed us that the following commit has been
> > merged:
> > 
> > 1da4bbeffe41 ("net: core: page_pool: add user refcnt and reintroduce page_pool_destroy")
> > 
> > The special error handling code to call page_pool_free() can now be
> > removed.  bnxt_free_rx_rings() will always be called during normal
> > shutdown or any error paths.
> > 
> > Fixes: 322b87ca55f2 ("bnxt_en: add page_pool support")
> > Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org>
> > Cc: Andy Gospodarek <gospo@broadcom.com>
> > Signed-off-by: Michael Chan <michael.chan@broadcom.com>
> > ---
> >  drivers/net/ethernet/broadcom/bnxt/bnxt.c | 8 ++------
> >  1 file changed, 2 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> > index e9d3bd8..2b5b0ab 100644
> > --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> > +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> > @@ -2500,6 +2500,7 @@ static void bnxt_free_rx_rings(struct bnxt *bp)
> >  		if (xdp_rxq_info_is_reg(&rxr->xdp_rxq))
> >  			xdp_rxq_info_unreg(&rxr->xdp_rxq);
> >  
> > +		page_pool_destroy(rxr->page_pool);
> >  		rxr->page_pool = NULL;
> >  
> >  		kfree(rxr->rx_tpa);
> > @@ -2560,19 +2561,14 @@ static int bnxt_alloc_rx_rings(struct bnxt *bp)
> >  			return rc;
> >  
> >  		rc = xdp_rxq_info_reg(&rxr->xdp_rxq, bp->dev, i);
> > -		if (rc < 0) {
> > -			page_pool_free(rxr->page_pool);
> > -			rxr->page_pool = NULL;
> > +		if (rc < 0)
> >  			return rc;
> > -		}
> >  
> >  		rc = xdp_rxq_info_reg_mem_model(&rxr->xdp_rxq,
> >  						MEM_TYPE_PAGE_POOL,
> >  						rxr->page_pool);
> >  		if (rc) {
> >  			xdp_rxq_info_unreg(&rxr->xdp_rxq);
> > -			page_pool_free(rxr->page_pool);
> > -			rxr->page_pool = NULL;
> 
> Rather than deleting these lines it would also be acceptable to do:
> 
>                 if (rc) {
>                         xdp_rxq_info_unreg(&rxr->xdp_rxq);
> -                       page_pool_free(rxr->page_pool);
> +                       page_pool_destroy(rxr->page_pool);
>                         rxr->page_pool = NULL;
>                         return rc;
>                 }
> 
> but anytime there is a failure to bnxt_alloc_rx_rings the driver will
> immediately follow it up with a call to bnxt_free_rx_rings, so
> page_pool_destroy will be called.
> 
> Thanks for pushing this out so quickly!
> 

I also can't find page_pool_release_page() or page_pool_put_page() called when
destroying the pool. Can you try to insmod -> do some traffic -> rmmod ?
If there are stale buffers that haven't been unmapped properly you'll get a
WARN_ON for them.
This part was added later on in the API when Jesper fixed in-flight packet
handling.

> Acked-by: Andy Gospodarek <gospo@broadcom.com> 
> 

Thanks
/Ilias

^ permalink raw reply

* [PATCH v3 bpf-next 4/4] selftests/bpf: fix compiling loop{1,2,3}.c on s390
From: Ilya Leoshkevich @ 2019-07-09 15:18 UTC (permalink / raw)
  To: bpf, netdev; +Cc: sdf, ys114321, davem, ast, daniel, Ilya Leoshkevich
In-Reply-To: <20190709151809.37539-1-iii@linux.ibm.com>

Use PT_REGS_RC(ctx) instead of ctx->rax, which is not present on s390.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/progs/loop1.c | 2 +-
 tools/testing/selftests/bpf/progs/loop2.c | 2 +-
 tools/testing/selftests/bpf/progs/loop3.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/loop1.c b/tools/testing/selftests/bpf/progs/loop1.c
index dea395af9ea9..7cdb7f878310 100644
--- a/tools/testing/selftests/bpf/progs/loop1.c
+++ b/tools/testing/selftests/bpf/progs/loop1.c
@@ -18,7 +18,7 @@ int nested_loops(volatile struct pt_regs* ctx)
 	for (j = 0; j < 300; j++)
 		for (i = 0; i < j; i++) {
 			if (j & 1)
-				m = ctx->rax;
+				m = PT_REGS_RC(ctx);
 			else
 				m = j;
 			sum += i * m;
diff --git a/tools/testing/selftests/bpf/progs/loop2.c b/tools/testing/selftests/bpf/progs/loop2.c
index 0637bd8e8bcf..9b2f808a2863 100644
--- a/tools/testing/selftests/bpf/progs/loop2.c
+++ b/tools/testing/selftests/bpf/progs/loop2.c
@@ -16,7 +16,7 @@ int while_true(volatile struct pt_regs* ctx)
 	int i = 0;
 
 	while (true) {
-		if (ctx->rax & 1)
+		if (PT_REGS_RC(ctx) & 1)
 			i += 3;
 		else
 			i += 7;
diff --git a/tools/testing/selftests/bpf/progs/loop3.c b/tools/testing/selftests/bpf/progs/loop3.c
index 30a0f6cba080..d727657d51e2 100644
--- a/tools/testing/selftests/bpf/progs/loop3.c
+++ b/tools/testing/selftests/bpf/progs/loop3.c
@@ -16,7 +16,7 @@ int while_true(volatile struct pt_regs* ctx)
 	__u64 i = 0, sum = 0;
 	do {
 		i++;
-		sum += ctx->rax;
+		sum += PT_REGS_RC(ctx);
 	} while (i < 0x100000000ULL);
 	return sum;
 }
-- 
2.21.0


^ permalink raw reply related

* [PATCH v3 bpf-next 3/4] selftests/bpf: make PT_REGS_* work in userspace
From: Ilya Leoshkevich @ 2019-07-09 15:18 UTC (permalink / raw)
  To: bpf, netdev; +Cc: sdf, ys114321, davem, ast, daniel, Ilya Leoshkevich
In-Reply-To: <20190709151809.37539-1-iii@linux.ibm.com>

Right now, on certain architectures, these macros are usable only with
kernel headers. This patch makes it possible to use them with userspace
headers and, as a consequence, not only in BPF samples, but also in BPF
selftests.

On s390, provide the forward declaration of struct pt_regs and cast it
to user_pt_regs in PT_REGS_* macros. This is necessary, because instead
of the full struct pt_regs, s390 exposes only its first member
user_pt_regs to userspace, and bpf_helpers.h is used with both userspace
(in selftests) and kernel (in samples) headers. It was added in commit
466698e654e8 ("s390/bpf: correct broken uapi for
BPF_PROG_TYPE_PERF_EVENT program type").

Ditto on arm64.

On x86, provide userspace versions of PT_REGS_* macros. Unlike s390 and
arm64, x86 provides struct pt_regs to both userspace and kernel, however,
with different member names.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/bpf_helpers.h | 61 +++++++++++++++--------
 1 file changed, 41 insertions(+), 20 deletions(-)

diff --git a/tools/testing/selftests/bpf/bpf_helpers.h b/tools/testing/selftests/bpf/bpf_helpers.h
index 73071a94769a..212ec564e5c3 100644
--- a/tools/testing/selftests/bpf/bpf_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_helpers.h
@@ -358,6 +358,7 @@ static int (*bpf_skb_adjust_room)(void *ctx, __s32 len_diff, __u32 mode,
 
 #if defined(bpf_target_x86)
 
+#ifdef __KERNEL__
 #define PT_REGS_PARM1(x) ((x)->di)
 #define PT_REGS_PARM2(x) ((x)->si)
 #define PT_REGS_PARM3(x) ((x)->dx)
@@ -368,19 +369,35 @@ static int (*bpf_skb_adjust_room)(void *ctx, __s32 len_diff, __u32 mode,
 #define PT_REGS_RC(x) ((x)->ax)
 #define PT_REGS_SP(x) ((x)->sp)
 #define PT_REGS_IP(x) ((x)->ip)
+#else
+#define PT_REGS_PARM1(x) ((x)->rdi)
+#define PT_REGS_PARM2(x) ((x)->rsi)
+#define PT_REGS_PARM3(x) ((x)->rdx)
+#define PT_REGS_PARM4(x) ((x)->rcx)
+#define PT_REGS_PARM5(x) ((x)->r8)
+#define PT_REGS_RET(x) ((x)->rsp)
+#define PT_REGS_FP(x) ((x)->rbp)
+#define PT_REGS_RC(x) ((x)->rax)
+#define PT_REGS_SP(x) ((x)->rsp)
+#define PT_REGS_IP(x) ((x)->rip)
+#endif
 
 #elif defined(bpf_target_s390)
 
-#define PT_REGS_PARM1(x) ((x)->gprs[2])
-#define PT_REGS_PARM2(x) ((x)->gprs[3])
-#define PT_REGS_PARM3(x) ((x)->gprs[4])
-#define PT_REGS_PARM4(x) ((x)->gprs[5])
-#define PT_REGS_PARM5(x) ((x)->gprs[6])
-#define PT_REGS_RET(x) ((x)->gprs[14])
-#define PT_REGS_FP(x) ((x)->gprs[11]) /* Works only with CONFIG_FRAME_POINTER */
-#define PT_REGS_RC(x) ((x)->gprs[2])
-#define PT_REGS_SP(x) ((x)->gprs[15])
-#define PT_REGS_IP(x) ((x)->psw.addr)
+/* s390 provides user_pt_regs instead of struct pt_regs to userspace */
+struct pt_regs;
+#define PT_REGS_S390 const volatile user_pt_regs
+#define PT_REGS_PARM1(x) (((PT_REGS_S390 *)(x))->gprs[2])
+#define PT_REGS_PARM2(x) (((PT_REGS_S390 *)(x))->gprs[3])
+#define PT_REGS_PARM3(x) (((PT_REGS_S390 *)(x))->gprs[4])
+#define PT_REGS_PARM4(x) (((PT_REGS_S390 *)(x))->gprs[5])
+#define PT_REGS_PARM5(x) (((PT_REGS_S390 *)(x))->gprs[6])
+#define PT_REGS_RET(x) (((PT_REGS_S390 *)(x))->gprs[14])
+/* Works only with CONFIG_FRAME_POINTER */
+#define PT_REGS_FP(x) (((PT_REGS_S390 *)(x))->gprs[11])
+#define PT_REGS_RC(x) (((PT_REGS_S390 *)(x))->gprs[2])
+#define PT_REGS_SP(x) (((PT_REGS_S390 *)(x))->gprs[15])
+#define PT_REGS_IP(x) (((PT_REGS_S390 *)(x))->psw.addr)
 
 #elif defined(bpf_target_arm)
 
@@ -397,16 +414,20 @@ static int (*bpf_skb_adjust_room)(void *ctx, __s32 len_diff, __u32 mode,
 
 #elif defined(bpf_target_arm64)
 
-#define PT_REGS_PARM1(x) ((x)->regs[0])
-#define PT_REGS_PARM2(x) ((x)->regs[1])
-#define PT_REGS_PARM3(x) ((x)->regs[2])
-#define PT_REGS_PARM4(x) ((x)->regs[3])
-#define PT_REGS_PARM5(x) ((x)->regs[4])
-#define PT_REGS_RET(x) ((x)->regs[30])
-#define PT_REGS_FP(x) ((x)->regs[29]) /* Works only with CONFIG_FRAME_POINTER */
-#define PT_REGS_RC(x) ((x)->regs[0])
-#define PT_REGS_SP(x) ((x)->sp)
-#define PT_REGS_IP(x) ((x)->pc)
+/* arm64 provides struct user_pt_regs instead of struct pt_regs to userspace */
+struct pt_regs;
+#define PT_REGS_ARM64 const volatile struct user_pt_regs
+#define PT_REGS_PARM1(x) (((PT_REGS_ARM64 *)(x))->regs[0])
+#define PT_REGS_PARM2(x) (((PT_REGS_ARM64 *)(x))->regs[1])
+#define PT_REGS_PARM3(x) (((PT_REGS_ARM64 *)(x))->regs[2])
+#define PT_REGS_PARM4(x) (((PT_REGS_ARM64 *)(x))->regs[3])
+#define PT_REGS_PARM5(x) (((PT_REGS_ARM64 *)(x))->regs[4])
+#define PT_REGS_RET(x) (((PT_REGS_ARM64 *)(x))->regs[30])
+/* Works only with CONFIG_FRAME_POINTER */
+#define PT_REGS_FP(x) (((PT_REGS_ARM64 *)(x))->regs[29])
+#define PT_REGS_RC(x) (((PT_REGS_ARM64 *)(x))->regs[0])
+#define PT_REGS_SP(x) (((PT_REGS_ARM64 *)(x))->sp)
+#define PT_REGS_IP(x) (((PT_REGS_ARM64 *)(x))->pc)
 
 #elif defined(bpf_target_mips)
 
-- 
2.21.0


^ permalink raw reply related

* [PATCH v3 bpf-next 2/4] selftests/bpf: fix s930 -> s390 typo
From: Ilya Leoshkevich @ 2019-07-09 15:18 UTC (permalink / raw)
  To: bpf, netdev; +Cc: sdf, ys114321, davem, ast, daniel, Ilya Leoshkevich
In-Reply-To: <20190709151809.37539-1-iii@linux.ibm.com>

Also check for __s390__ instead of __s390x__, just in case bpf_helpers.h
is ever used by 32-bit userspace.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/bpf_helpers.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/bpf/bpf_helpers.h b/tools/testing/selftests/bpf/bpf_helpers.h
index 5a3d92c8bec8..73071a94769a 100644
--- a/tools/testing/selftests/bpf/bpf_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_helpers.h
@@ -315,8 +315,8 @@ static int (*bpf_skb_adjust_room)(void *ctx, __s32 len_diff, __u32 mode,
 #if defined(__TARGET_ARCH_x86)
 	#define bpf_target_x86
 	#define bpf_target_defined
-#elif defined(__TARGET_ARCH_s930x)
-	#define bpf_target_s930x
+#elif defined(__TARGET_ARCH_s390)
+	#define bpf_target_s390
 	#define bpf_target_defined
 #elif defined(__TARGET_ARCH_arm)
 	#define bpf_target_arm
@@ -341,8 +341,8 @@ static int (*bpf_skb_adjust_room)(void *ctx, __s32 len_diff, __u32 mode,
 #ifndef bpf_target_defined
 #if defined(__x86_64__)
 	#define bpf_target_x86
-#elif defined(__s390x__)
-	#define bpf_target_s930x
+#elif defined(__s390__)
+	#define bpf_target_s390
 #elif defined(__arm__)
 	#define bpf_target_arm
 #elif defined(__aarch64__)
@@ -369,7 +369,7 @@ static int (*bpf_skb_adjust_room)(void *ctx, __s32 len_diff, __u32 mode,
 #define PT_REGS_SP(x) ((x)->sp)
 #define PT_REGS_IP(x) ((x)->ip)
 
-#elif defined(bpf_target_s390x)
+#elif defined(bpf_target_s390)
 
 #define PT_REGS_PARM1(x) ((x)->gprs[2])
 #define PT_REGS_PARM2(x) ((x)->gprs[3])
-- 
2.21.0


^ permalink raw reply related

* [PATCH v3 bpf-next 1/4] selftests/bpf: compile progs with -D__TARGET_ARCH_$(ARCH)
From: Ilya Leoshkevich @ 2019-07-09 15:18 UTC (permalink / raw)
  To: bpf, netdev; +Cc: sdf, ys114321, davem, ast, daniel, Ilya Leoshkevich
In-Reply-To: <20190709151809.37539-1-iii@linux.ibm.com>

This opens up the possibility of accessing registers in an
arch-independent way.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/Makefile | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 2620406a53ec..59d89d5aa05e 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -1,4 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
+include ../../../scripts/Makefile.arch
 
 LIBDIR := ../../../lib
 BPFDIR := $(LIBDIR)/bpf
@@ -138,7 +139,8 @@ CLANG_SYS_INCLUDES := $(shell $(CLANG) -v -E - </dev/null 2>&1 \
 
 CLANG_FLAGS = -I. -I./include/uapi -I../../../include/uapi \
 	      $(CLANG_SYS_INCLUDES) \
-	      -Wno-compare-distinct-pointer-types
+	      -Wno-compare-distinct-pointer-types \
+	      -D__TARGET_ARCH_$(ARCH)
 
 $(OUTPUT)/test_l4lb_noinline.o: CLANG_FLAGS += -fno-inline
 $(OUTPUT)/test_xdp_noinline.o: CLANG_FLAGS += -fno-inline
-- 
2.21.0
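
How a -D__TARGET_ARCH_$(ARCH) flag can feed the header's detection, as a
reduced sketch: an explicit define wins, and compiler-provided macros are
only a fallback. DEMO_TARGET and demo_target() are hypothetical names,
and only a few architectures are listed.

```c
/* Stage 1: explicit selection from the build system. */
#if defined(__TARGET_ARCH_x86)
#define DEMO_TARGET "x86"
#elif defined(__TARGET_ARCH_s390)
#define DEMO_TARGET "s390"
#elif defined(__TARGET_ARCH_arm64)
#define DEMO_TARGET "arm64"
/* Stage 2: fall back to what the compiler reports about itself. */
#elif defined(__x86_64__)
#define DEMO_TARGET "x86"
#elif defined(__s390__)
#define DEMO_TARGET "s390"
#elif defined(__aarch64__)
#define DEMO_TARGET "arm64"
#else
#define DEMO_TARGET "unknown"
#endif

/* Return the name of the selected target. */
static const char *demo_target(void)
{
	return DEMO_TARGET;
}
```

The explicit stage matters for BPF objects because clang targeting BPF
does not define the host's architecture macros.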


^ permalink raw reply related

* [PATCH v3 bpf-next 0/4] selftests/bpf: fix compiling loop{1,2,3}.c on s390
From: Ilya Leoshkevich @ 2019-07-09 15:18 UTC (permalink / raw)
  To: bpf, netdev; +Cc: sdf, ys114321, davem, ast, daniel, Ilya Leoshkevich

Use PT_REGS_RC(ctx) instead of ctx->rax, which is not present on s390.

This patch series consists of three preparatory commits, which make it
possible to use PT_REGS_RC in BPF selftests, followed by the actual fix.

Since the last time, I've tested it with x86_64-linux-gnu-,
aarch64-linux-gnu-, arm-linux-gnueabihf-, mips64el-linux-gnuabi64-,
powerpc64le-linux-gnu-, s390x-linux-gnu- and sparc64-linux-gnu-
compilers, and found that I also need to add arm64 support.

Like s390, arm64 exports user_pt_regs instead of struct pt_regs to
userspace.

I've also made fixes for a few unrelated build problems, which I will
post separately.

v1->v2: Split into multiple patches.
v2->v3: Added arm64 support.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>



^ permalink raw reply

* Re: [PATCH v2 net-next 3/3] tc-testing: introduce scapyPlugin for basic traffic
From: Alexander Aring @ 2019-07-09 15:10 UTC (permalink / raw)
  To: Lucas Bates
  Cc: David Miller, Linux Kernel Network Developers, Jamal Hadi Salim,
	Cong Wang, Jiri Pirko, Marcelo Ricardo Leitner, Vlad Buslov,
	Davide Caratti, kernel
In-Reply-To: <CAMDBHYKwxnJYMp97vMmhZR5unqT7LyXivhqFm+-Vc59LMqmO4A@mail.gmail.com>

On Mon, Jul 08, 2019 at 09:28:09PM -0400, Lucas Bates wrote:
> Sorry Alex, I completely forgot about this email.
> On Thu, Jul 4, 2019 at 4:29 PM Alexander Aring <aring@mojatatu.com> wrote:
> >
> > Hi,
> >
> > On Wed, Jul 03, 2019 at 08:45:02PM -0400, Lucas Bates wrote:
> > > The scapyPlugin allows for simple traffic generation in tdc to
> > > test various tc features. It was tested with scapy v2.4.2, but
> > > should work with any successive version.
> > Is there a way to introduce third party scapy level descriptions which
> > are not upstream yet?
> 
> Upstream to scapy? Not yet.  This version of the plugin is extremely
> simple, and good for basic traffic.  I'll add features to it so we can
> get more creative with the packets that can be sent, though.
> 

Can you add this now? I have some tests here for ife and I am on the way
to send it upstream to scapy.

This isn't done yet, so I would like to provide them via an external
directory in the tc-testing directory.

Thanks.

- Alex

^ permalink raw reply

* Fw: [Bug 204099] New: systemd-networkd fails on 5.2 - same version works on 5.1.16
From: Stephen Hemminger @ 2019-07-09 14:43 UTC (permalink / raw)
  To: netdev

Looks like the stricter netlink validation broke userspace.
This is bad.

Begin forwarded message:

Date: Tue, 09 Jul 2019 00:44:01 +0000
From: bugzilla-daemon@bugzilla.kernel.org
To: stephen@networkplumber.org
Subject: [Bug 204099] New: systemd-networkd fails on 5.2 - same version works on 5.1.16


https://bugzilla.kernel.org/show_bug.cgi?id=204099

            Bug ID: 204099
           Summary: systemd-networkd fails on 5.2 - same version works on
                    5.1.16
           Product: Networking
           Version: 2.5
    Kernel Version: 5.2
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: low
          Priority: P1
         Component: Other
          Assignee: stephen@networkplumber.org
          Reporter: Ian.kumlien@gmail.com
        Regression: No

This is more FYI, I haven't had time to properly debug it.

Booting 5.2 causes systemd-networkd to fail to bring any interface up, it will
fail with: "Could not bring up interface: Invalid argument"

However, booting 5.1.16 with the same software works just fine.

Sounds like something was changed in, what I assume is, the netlink API

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply

