Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH net] net: rtnetlink: fix info leak in RTM_GETSTATS call
From: David Miller @ 2017-10-03 17:19 UTC (permalink / raw)
  To: nikolay; +Cc: netdev, keescook, dvyukov, andreyknvl, kcc, roopa, glider,
	edumazet
In-Reply-To: <1507026048-13734-1-git-send-email-nikolay@cumulusnetworks.com>

From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Date: Tue,  3 Oct 2017 13:20:48 +0300

> When RTM_GETSTATS was added the fields of its header struct were not all
> initialized when returning the result thus leaking 4 bytes of information
> to user-space per rtnl_fill_statsinfo call, so initialize them now. Thanks
> to Alexander Potapenko for the detailed report and bisection.
> 
> Reported-by: Alexander Potapenko <glider@google.com>
> Fixes: 10c9ead9f3c6 ("rtnetlink: add new RTM_GETSTATS message to dump link stats")
> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>

Applied and queued up for -stable, thanks.

^ permalink raw reply

* Re: [PATCH] net: dsa: lan9303: make functions lan9303_mdio_phy_{read|write} static
From: David Miller @ 2017-10-03 17:20 UTC (permalink / raw)
  To: colin.king
  Cc: andrew, vivien.didelot, f.fainelli, netdev, kernel-janitors,
	linux-kernel
In-Reply-To: <20171003103918.26934-1-colin.king@canonical.com>

From: Colin King <colin.king@canonical.com>
Date: Tue,  3 Oct 2017 11:39:18 +0100

> From: Colin Ian King <colin.king@canonical.com>
> 
> The functions lan9303_mdio_phy_write and lan9303_mdio_phy_read are local
> to the source and do not need to be in global scope, so make them static.
> 
> Cleans up sparse warnings:
> symbol 'lan9303_mdio_phy_write' was not declared. Should it be static?
> symbol 'lan9303_mdio_phy_read' was not declared. Should it be static?
> 
> Signed-off-by: Colin Ian King <colin.king@canonical.com>

Applied.

^ permalink raw reply

* Re: [PATCH] net: dsa: mt7530: make functions mt7530_phy_write static
From: David Miller @ 2017-10-03 17:20 UTC (permalink / raw)
  To: colin.king
  Cc: andrew, vivien.didelot, f.fainelli, netdev, kernel-janitors,
	linux-kernel
In-Reply-To: <20171003104633.27151-1-colin.king@canonical.com>

From: Colin King <colin.king@canonical.com>
Date: Tue,  3 Oct 2017 11:46:33 +0100

> From: Colin Ian King <colin.king@canonical.com>
> 
> The function mt7530_phy_write is local to the source and does not need to
> be in global scope, so make it static.
> 
> Cleans up sparse warnings:
> symbol 'mt7530_phy_write' was not declared. Should it be static?
> 
> Signed-off-by: Colin Ian King <colin.king@canonical.com>

Applied.

^ permalink raw reply

* Re: [PATCH 1/2 net-next] mlxsw: spectrum: Fix check for IS_ERR() instead of NULL
From: David Miller @ 2017-10-03 17:27 UTC (permalink / raw)
  To: dan.carpenter; +Cc: jiri, yotamg, idosch, netdev, kernel-janitors
In-Reply-To: <20171003105303.u7yrzxknddmmerol@mwanda>

From: Dan Carpenter <dan.carpenter@oracle.com>
Date: Tue, 3 Oct 2017 13:53:03 +0300

> mlxsw_afa_block_create() doesn't return error pointers, it returns NULL
> on error.
> 
> Fixes: 0e14c7777acb ("mlxsw: spectrum: Add the multicast routing hardware logic")
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

Applied.

^ permalink raw reply

* Re: [PATCH 2/2 net-next] mlxsw: spectrum: Add missing error code on allocation failure
From: David Miller @ 2017-10-03 17:27 UTC (permalink / raw)
  To: dan.carpenter; +Cc: jiri, yotamg, idosch, netdev, kernel-janitors
In-Reply-To: <20171003105340.llwk5oajgrohbksu@mwanda>

From: Dan Carpenter <dan.carpenter@oracle.com>
Date: Tue, 3 Oct 2017 13:53:41 +0300

> We accidentally return success if the kmalloc_array() call fails.
> 
> Fixes: 0e14c7777acb ("mlxsw: spectrum: Add the multicast routing hardware logic")
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

Applied.

^ permalink raw reply

* RE: [PATCH] net: phy: DP83822 initial driver submission
From: Woojung.Huh @ 2017-10-03 17:43 UTC (permalink / raw)
  To: dmurphy, andrew, f.fainelli; +Cc: netdev
In-Reply-To: <20171003155316.12312-1-dmurphy@ti.com>

> +static int dp83822_suspend(struct phy_device *phydev)
> +{
> +	int value;
> +
> +	mutex_lock(&phydev->lock);
> +
> +	value = phy_read_mmd(phydev, DP83822_DEVADDR,
> MII_DP83822_WOL_CFG);
> +	if (~value & DP83822_WOL_EN) {
Same result, but how about " if (!(value & DP83822_WOL_EN))" ?

> +		value = phy_read(phydev, MII_BMCR);
> +		phy_write(phydev, MII_BMCR, value | BMCR_PDOWN);
> +	}
> +
> +	mutex_unlock(&phydev->lock);
> +
> +	return 0;
> +}

^ permalink raw reply

* [PATCH iproute2 0/3] ss: add AF_VSOCK support
From: Stefan Hajnoczi @ 2017-10-03 17:57 UTC (permalink / raw)
  To: netdev; +Cc: Stephen Hemminger, Jorgen Hansen, Dexuan Cui, Stefan Hajnoczi

This patch series adds AF_VSOCK support to ss(8).  AF_VSOCK is a host<->guest
communications channel supported by VMware, KVM (virtio-vsock), and Hyper-V.

To dump AF_VSOCK sockets:

  $ ss --vsock

The vsock_diag.ko kernel module for Linux was posted in "[PATCH 0/5] VSOCK: add
sock_diag interface".  That patch series also adds the
<linux/vm_sockets_diag.h> header file that this patch series relies on.  The
Linux patches are also available here:

  https://github.com/stefanha/linux/commits/vsock-diag

Stefan Hajnoczi (3):
  ss: allow AF_FAMILY constants >32
  include: add <linux/vm_sockets_diag.h>
  ss: add AF_VSOCK support

 include/linux/vm_sockets_diag.h |  33 ++++++
 misc/ss.c                       | 238 +++++++++++++++++++++++++++++++++++-----
 man/man8/ss.8                   |   8 +-
 3 files changed, 249 insertions(+), 30 deletions(-)
 create mode 100644 include/linux/vm_sockets_diag.h

-- 
2.13.6

^ permalink raw reply

* [RFC 1/2] bpf: move instruction printing into a separate file
From: Jakub Kicinski @ 2017-10-03 17:57 UTC (permalink / raw)
  To: daniel, dsahern, alexei.starovoitov
  Cc: netdev, oss-drivers, david.beckett, Jakub Kicinski
In-Reply-To: <59D3B456.4020209@iogearbox.net>

Separate the instruction printing into a standalone source file.
This way sneaky code from tools/ can use it directly.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
Like this?

 kernel/bpf/Makefile   |   1 +
 kernel/bpf/disasm.c   | 212 ++++++++++++++++++++++++++++++++++++++++++++++++++
 kernel/bpf/disasm.h   |  29 +++++++
 kernel/bpf/verifier.c | 200 +----------------------------------------------
 4 files changed, 245 insertions(+), 197 deletions(-)
 create mode 100644 kernel/bpf/disasm.c
 create mode 100644 kernel/bpf/disasm.h

diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index 897daa005b23..53fb09f92e3f 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -2,6 +2,7 @@ obj-y := core.o
 
 obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o
 obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o
+obj-$(CONFIG_BPF_SYSCALL) += disasm.o
 ifeq ($(CONFIG_NET),y)
 obj-$(CONFIG_BPF_SYSCALL) += devmap.o
 ifeq ($(CONFIG_STREAM_PARSER),y)
diff --git a/kernel/bpf/disasm.c b/kernel/bpf/disasm.c
new file mode 100644
index 000000000000..f2194fa442a4
--- /dev/null
+++ b/kernel/bpf/disasm.c
@@ -0,0 +1,212 @@
+/* Copyright (c) 2011-2014 PLUMgrid, http://plumgrid.com
+ * Copyright (c) 2016 Facebook
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ */
+
+#include <linux/bpf.h>
+
+#include "disasm.h"
+
+#define __BPF_FUNC_STR_FN(x) [BPF_FUNC_ ## x] = __stringify(bpf_ ## x)
+static const char * const func_id_str[] = {
+	__BPF_FUNC_MAPPER(__BPF_FUNC_STR_FN)
+};
+#undef __BPF_FUNC_STR_FN
+
+const char *func_id_name(int id)
+{
+	BUILD_BUG_ON(ARRAY_SIZE(func_id_str) != __BPF_FUNC_MAX_ID);
+
+	if (id >= 0 && id < __BPF_FUNC_MAX_ID && func_id_str[id])
+		return func_id_str[id];
+	else
+		return "unknown";
+}
+
+const char *const bpf_class_string[8] = {
+	[BPF_LD]    = "ld",
+	[BPF_LDX]   = "ldx",
+	[BPF_ST]    = "st",
+	[BPF_STX]   = "stx",
+	[BPF_ALU]   = "alu",
+	[BPF_JMP]   = "jmp",
+	[BPF_RET]   = "BUG",
+	[BPF_ALU64] = "alu64",
+};
+
+const char *const bpf_alu_string[16] = {
+	[BPF_ADD >> 4]  = "+=",
+	[BPF_SUB >> 4]  = "-=",
+	[BPF_MUL >> 4]  = "*=",
+	[BPF_DIV >> 4]  = "/=",
+	[BPF_OR  >> 4]  = "|=",
+	[BPF_AND >> 4]  = "&=",
+	[BPF_LSH >> 4]  = "<<=",
+	[BPF_RSH >> 4]  = ">>=",
+	[BPF_NEG >> 4]  = "neg",
+	[BPF_MOD >> 4]  = "%=",
+	[BPF_XOR >> 4]  = "^=",
+	[BPF_MOV >> 4]  = "=",
+	[BPF_ARSH >> 4] = "s>>=",
+	[BPF_END >> 4]  = "endian",
+};
+
+static const char *const bpf_ldst_string[] = {
+	[BPF_W >> 3]  = "u32",
+	[BPF_H >> 3]  = "u16",
+	[BPF_B >> 3]  = "u8",
+	[BPF_DW >> 3] = "u64",
+};
+
+static const char *const bpf_jmp_string[16] = {
+	[BPF_JA >> 4]   = "jmp",
+	[BPF_JEQ >> 4]  = "==",
+	[BPF_JGT >> 4]  = ">",
+	[BPF_JLT >> 4]  = "<",
+	[BPF_JGE >> 4]  = ">=",
+	[BPF_JLE >> 4]  = "<=",
+	[BPF_JSET >> 4] = "&",
+	[BPF_JNE >> 4]  = "!=",
+	[BPF_JSGT >> 4] = "s>",
+	[BPF_JSLT >> 4] = "s<",
+	[BPF_JSGE >> 4] = "s>=",
+	[BPF_JSLE >> 4] = "s<=",
+	[BPF_CALL >> 4] = "call",
+	[BPF_EXIT >> 4] = "exit",
+};
+
+static void print_bpf_end_insn(void (*verbose)(const char *, ...),
+			       const struct bpf_insn *insn)
+{
+	verbose("(%02x) r%d = %s%d r%d\n", insn->code, insn->dst_reg,
+		BPF_SRC(insn->code) == BPF_TO_BE ? "be" : "le",
+		insn->imm, insn->dst_reg);
+}
+
+void print_bpf_insn(void (*verbose)(const char *, ...),
+		    const struct bpf_insn *insn, bool allow_ptr_leaks)
+{
+	u8 class = BPF_CLASS(insn->code);
+
+	if (class == BPF_ALU || class == BPF_ALU64) {
+		if (BPF_OP(insn->code) == BPF_END) {
+			if (class == BPF_ALU64)
+				verbose("BUG_alu64_%02x\n", insn->code);
+			else
+				print_bpf_end_insn(verbose, insn);
+		} else if (BPF_OP(insn->code) == BPF_NEG) {
+			verbose("(%02x) r%d = %s-r%d\n",
+				insn->code, insn->dst_reg,
+				class == BPF_ALU ? "(u32) " : "",
+				insn->dst_reg);
+		} else if (BPF_SRC(insn->code) == BPF_X) {
+			verbose("(%02x) %sr%d %s %sr%d\n",
+				insn->code, class == BPF_ALU ? "(u32) " : "",
+				insn->dst_reg,
+				bpf_alu_string[BPF_OP(insn->code) >> 4],
+				class == BPF_ALU ? "(u32) " : "",
+				insn->src_reg);
+		} else {
+			verbose("(%02x) %sr%d %s %s%d\n",
+				insn->code, class == BPF_ALU ? "(u32) " : "",
+				insn->dst_reg,
+				bpf_alu_string[BPF_OP(insn->code) >> 4],
+				class == BPF_ALU ? "(u32) " : "",
+				insn->imm);
+		}
+	} else if (class == BPF_STX) {
+		if (BPF_MODE(insn->code) == BPF_MEM)
+			verbose("(%02x) *(%s *)(r%d %+d) = r%d\n",
+				insn->code,
+				bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
+				insn->dst_reg,
+				insn->off, insn->src_reg);
+		else if (BPF_MODE(insn->code) == BPF_XADD)
+			verbose("(%02x) lock *(%s *)(r%d %+d) += r%d\n",
+				insn->code,
+				bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
+				insn->dst_reg, insn->off,
+				insn->src_reg);
+		else
+			verbose("BUG_%02x\n", insn->code);
+	} else if (class == BPF_ST) {
+		if (BPF_MODE(insn->code) != BPF_MEM) {
+			verbose("BUG_st_%02x\n", insn->code);
+			return;
+		}
+		verbose("(%02x) *(%s *)(r%d %+d) = %d\n",
+			insn->code,
+			bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
+			insn->dst_reg,
+			insn->off, insn->imm);
+	} else if (class == BPF_LDX) {
+		if (BPF_MODE(insn->code) != BPF_MEM) {
+			verbose("BUG_ldx_%02x\n", insn->code);
+			return;
+		}
+		verbose("(%02x) r%d = *(%s *)(r%d %+d)\n",
+			insn->code, insn->dst_reg,
+			bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
+			insn->src_reg, insn->off);
+	} else if (class == BPF_LD) {
+		if (BPF_MODE(insn->code) == BPF_ABS) {
+			verbose("(%02x) r0 = *(%s *)skb[%d]\n",
+				insn->code,
+				bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
+				insn->imm);
+		} else if (BPF_MODE(insn->code) == BPF_IND) {
+			verbose("(%02x) r0 = *(%s *)skb[r%d + %d]\n",
+				insn->code,
+				bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
+				insn->src_reg, insn->imm);
+		} else if (BPF_MODE(insn->code) == BPF_IMM &&
+			   BPF_SIZE(insn->code) == BPF_DW) {
+			/* At this point, we already made sure that the second
+			 * part of the ldimm64 insn is accessible.
+			 */
+			u64 imm = ((u64)(insn + 1)->imm << 32) | (u32)insn->imm;
+			bool map_ptr = insn->src_reg == BPF_PSEUDO_MAP_FD;
+
+			if (map_ptr && allow_ptr_leaks)
+				imm = 0;
+
+			verbose("(%02x) r%d = 0x%llx\n", insn->code,
+				insn->dst_reg, (unsigned long long)imm);
+		} else {
+			verbose("BUG_ld_%02x\n", insn->code);
+			return;
+		}
+	} else if (class == BPF_JMP) {
+		u8 opcode = BPF_OP(insn->code);
+
+		if (opcode == BPF_CALL) {
+			verbose("(%02x) call %s#%d\n", insn->code,
+				func_id_name(insn->imm), insn->imm);
+		} else if (insn->code == (BPF_JMP | BPF_JA)) {
+			verbose("(%02x) goto pc%+d\n",
+				insn->code, insn->off);
+		} else if (insn->code == (BPF_JMP | BPF_EXIT)) {
+			verbose("(%02x) exit\n", insn->code);
+		} else if (BPF_SRC(insn->code) == BPF_X) {
+			verbose("(%02x) if r%d %s r%d goto pc%+d\n",
+				insn->code, insn->dst_reg,
+				bpf_jmp_string[BPF_OP(insn->code) >> 4],
+				insn->src_reg, insn->off);
+		} else {
+			verbose("(%02x) if r%d %s 0x%x goto pc%+d\n",
+				insn->code, insn->dst_reg,
+				bpf_jmp_string[BPF_OP(insn->code) >> 4],
+				insn->imm, insn->off);
+		}
+	} else {
+		verbose("(%02x) %s\n", insn->code, bpf_class_string[class]);
+	}
+}
diff --git a/kernel/bpf/disasm.h b/kernel/bpf/disasm.h
new file mode 100644
index 000000000000..fa5637334bb1
--- /dev/null
+++ b/kernel/bpf/disasm.h
@@ -0,0 +1,29 @@
+/* Copyright (c) 2011-2014 PLUMgrid, http://plumgrid.com
+ * Copyright (c) 2016 Facebook
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ */
+
+#ifndef __BPF_DISASM_H__
+#define __BPF_DISASM_H__
+
+#include <linux/bpf.h>
+#include <linux/kernel.h>
+#include <linux/stringify.h>
+
+extern const char *const bpf_alu_string[16];
+extern const char *const bpf_class_string[8];
+
+const char *func_id_name(int id);
+
+void print_bpf_insn(void (*verbose)(const char *, ...),
+		    const struct bpf_insn *insn, bool allow_ptr_leaks);
+
+#endif
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 4cf9b72c59a0..1e40b5d41b9e 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -21,6 +21,8 @@
 #include <linux/vmalloc.h>
 #include <linux/stringify.h>
 
+#include "disasm.h"
+
 /* bpf_check() is a static code analyzer that walks eBPF program
  * instruction by instruction and updates register/stack state.
  * All paths of conditional branches are analyzed until 'bpf_exit' insn.
@@ -197,22 +199,6 @@ static const char * const reg_type_str[] = {
 	[PTR_TO_PACKET_END]	= "pkt_end",
 };
 
-#define __BPF_FUNC_STR_FN(x) [BPF_FUNC_ ## x] = __stringify(bpf_ ## x)
-static const char * const func_id_str[] = {
-	__BPF_FUNC_MAPPER(__BPF_FUNC_STR_FN)
-};
-#undef __BPF_FUNC_STR_FN
-
-static const char *func_id_name(int id)
-{
-	BUILD_BUG_ON(ARRAY_SIZE(func_id_str) != __BPF_FUNC_MAX_ID);
-
-	if (id >= 0 && id < __BPF_FUNC_MAX_ID && func_id_str[id])
-		return func_id_str[id];
-	else
-		return "unknown";
-}
-
 static void print_verifier_state(struct bpf_verifier_state *state)
 {
 	struct bpf_reg_state *reg;
@@ -280,186 +266,6 @@ static void print_verifier_state(struct bpf_verifier_state *state)
 	verbose("\n");
 }
 
-static const char *const bpf_class_string[] = {
-	[BPF_LD]    = "ld",
-	[BPF_LDX]   = "ldx",
-	[BPF_ST]    = "st",
-	[BPF_STX]   = "stx",
-	[BPF_ALU]   = "alu",
-	[BPF_JMP]   = "jmp",
-	[BPF_RET]   = "BUG",
-	[BPF_ALU64] = "alu64",
-};
-
-static const char *const bpf_alu_string[16] = {
-	[BPF_ADD >> 4]  = "+=",
-	[BPF_SUB >> 4]  = "-=",
-	[BPF_MUL >> 4]  = "*=",
-	[BPF_DIV >> 4]  = "/=",
-	[BPF_OR  >> 4]  = "|=",
-	[BPF_AND >> 4]  = "&=",
-	[BPF_LSH >> 4]  = "<<=",
-	[BPF_RSH >> 4]  = ">>=",
-	[BPF_NEG >> 4]  = "neg",
-	[BPF_MOD >> 4]  = "%=",
-	[BPF_XOR >> 4]  = "^=",
-	[BPF_MOV >> 4]  = "=",
-	[BPF_ARSH >> 4] = "s>>=",
-	[BPF_END >> 4]  = "endian",
-};
-
-static const char *const bpf_ldst_string[] = {
-	[BPF_W >> 3]  = "u32",
-	[BPF_H >> 3]  = "u16",
-	[BPF_B >> 3]  = "u8",
-	[BPF_DW >> 3] = "u64",
-};
-
-static const char *const bpf_jmp_string[16] = {
-	[BPF_JA >> 4]   = "jmp",
-	[BPF_JEQ >> 4]  = "==",
-	[BPF_JGT >> 4]  = ">",
-	[BPF_JLT >> 4]  = "<",
-	[BPF_JGE >> 4]  = ">=",
-	[BPF_JLE >> 4]  = "<=",
-	[BPF_JSET >> 4] = "&",
-	[BPF_JNE >> 4]  = "!=",
-	[BPF_JSGT >> 4] = "s>",
-	[BPF_JSLT >> 4] = "s<",
-	[BPF_JSGE >> 4] = "s>=",
-	[BPF_JSLE >> 4] = "s<=",
-	[BPF_CALL >> 4] = "call",
-	[BPF_EXIT >> 4] = "exit",
-};
-
-static void print_bpf_end_insn(const struct bpf_verifier_env *env,
-			       const struct bpf_insn *insn)
-{
-	verbose("(%02x) r%d = %s%d r%d\n", insn->code, insn->dst_reg,
-		BPF_SRC(insn->code) == BPF_TO_BE ? "be" : "le",
-		insn->imm, insn->dst_reg);
-}
-
-static void print_bpf_insn(const struct bpf_verifier_env *env,
-			   const struct bpf_insn *insn)
-{
-	u8 class = BPF_CLASS(insn->code);
-
-	if (class == BPF_ALU || class == BPF_ALU64) {
-		if (BPF_OP(insn->code) == BPF_END) {
-			if (class == BPF_ALU64)
-				verbose("BUG_alu64_%02x\n", insn->code);
-			else
-				print_bpf_end_insn(env, insn);
-		} else if (BPF_OP(insn->code) == BPF_NEG) {
-			verbose("(%02x) r%d = %s-r%d\n",
-				insn->code, insn->dst_reg,
-				class == BPF_ALU ? "(u32) " : "",
-				insn->dst_reg);
-		} else if (BPF_SRC(insn->code) == BPF_X) {
-			verbose("(%02x) %sr%d %s %sr%d\n",
-				insn->code, class == BPF_ALU ? "(u32) " : "",
-				insn->dst_reg,
-				bpf_alu_string[BPF_OP(insn->code) >> 4],
-				class == BPF_ALU ? "(u32) " : "",
-				insn->src_reg);
-		} else {
-			verbose("(%02x) %sr%d %s %s%d\n",
-				insn->code, class == BPF_ALU ? "(u32) " : "",
-				insn->dst_reg,
-				bpf_alu_string[BPF_OP(insn->code) >> 4],
-				class == BPF_ALU ? "(u32) " : "",
-				insn->imm);
-		}
-	} else if (class == BPF_STX) {
-		if (BPF_MODE(insn->code) == BPF_MEM)
-			verbose("(%02x) *(%s *)(r%d %+d) = r%d\n",
-				insn->code,
-				bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
-				insn->dst_reg,
-				insn->off, insn->src_reg);
-		else if (BPF_MODE(insn->code) == BPF_XADD)
-			verbose("(%02x) lock *(%s *)(r%d %+d) += r%d\n",
-				insn->code,
-				bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
-				insn->dst_reg, insn->off,
-				insn->src_reg);
-		else
-			verbose("BUG_%02x\n", insn->code);
-	} else if (class == BPF_ST) {
-		if (BPF_MODE(insn->code) != BPF_MEM) {
-			verbose("BUG_st_%02x\n", insn->code);
-			return;
-		}
-		verbose("(%02x) *(%s *)(r%d %+d) = %d\n",
-			insn->code,
-			bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
-			insn->dst_reg,
-			insn->off, insn->imm);
-	} else if (class == BPF_LDX) {
-		if (BPF_MODE(insn->code) != BPF_MEM) {
-			verbose("BUG_ldx_%02x\n", insn->code);
-			return;
-		}
-		verbose("(%02x) r%d = *(%s *)(r%d %+d)\n",
-			insn->code, insn->dst_reg,
-			bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
-			insn->src_reg, insn->off);
-	} else if (class == BPF_LD) {
-		if (BPF_MODE(insn->code) == BPF_ABS) {
-			verbose("(%02x) r0 = *(%s *)skb[%d]\n",
-				insn->code,
-				bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
-				insn->imm);
-		} else if (BPF_MODE(insn->code) == BPF_IND) {
-			verbose("(%02x) r0 = *(%s *)skb[r%d + %d]\n",
-				insn->code,
-				bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
-				insn->src_reg, insn->imm);
-		} else if (BPF_MODE(insn->code) == BPF_IMM &&
-			   BPF_SIZE(insn->code) == BPF_DW) {
-			/* At this point, we already made sure that the second
-			 * part of the ldimm64 insn is accessible.
-			 */
-			u64 imm = ((u64)(insn + 1)->imm << 32) | (u32)insn->imm;
-			bool map_ptr = insn->src_reg == BPF_PSEUDO_MAP_FD;
-
-			if (map_ptr && !env->allow_ptr_leaks)
-				imm = 0;
-
-			verbose("(%02x) r%d = 0x%llx\n", insn->code,
-				insn->dst_reg, (unsigned long long)imm);
-		} else {
-			verbose("BUG_ld_%02x\n", insn->code);
-			return;
-		}
-	} else if (class == BPF_JMP) {
-		u8 opcode = BPF_OP(insn->code);
-
-		if (opcode == BPF_CALL) {
-			verbose("(%02x) call %s#%d\n", insn->code,
-				func_id_name(insn->imm), insn->imm);
-		} else if (insn->code == (BPF_JMP | BPF_JA)) {
-			verbose("(%02x) goto pc%+d\n",
-				insn->code, insn->off);
-		} else if (insn->code == (BPF_JMP | BPF_EXIT)) {
-			verbose("(%02x) exit\n", insn->code);
-		} else if (BPF_SRC(insn->code) == BPF_X) {
-			verbose("(%02x) if r%d %s r%d goto pc%+d\n",
-				insn->code, insn->dst_reg,
-				bpf_jmp_string[BPF_OP(insn->code) >> 4],
-				insn->src_reg, insn->off);
-		} else {
-			verbose("(%02x) if r%d %s 0x%x goto pc%+d\n",
-				insn->code, insn->dst_reg,
-				bpf_jmp_string[BPF_OP(insn->code) >> 4],
-				insn->imm, insn->off);
-		}
-	} else {
-		verbose("(%02x) %s\n", insn->code, bpf_class_string[class]);
-	}
-}
-
 static int pop_stack(struct bpf_verifier_env *env, int *prev_insn_idx)
 {
 	struct bpf_verifier_stack_elem *elem;
@@ -3693,7 +3499,7 @@ static int do_check(struct bpf_verifier_env *env)
 
 		if (log_level) {
 			verbose("%d: ", insn_idx);
-			print_bpf_insn(env, insn);
+			print_bpf_insn(verbose, insn, env->allow_ptr_leaks);
 		}
 
 		err = ext_analyzer_insn_hook(env, insn_idx, prev_insn_idx);
-- 
2.14.1

^ permalink raw reply related

* [RFC 2/2] tools: bpftool: use the kernel's instruction printer
From: Jakub Kicinski @ 2017-10-03 17:57 UTC (permalink / raw)
  To: daniel, dsahern, alexei.starovoitov
  Cc: netdev, oss-drivers, david.beckett, Jakub Kicinski
In-Reply-To: <20171003175746.30145-1-jakub.kicinski@netronome.com>

Compile the instruction printer from kernel/bpf and use it
for disassembling "translated" eBPF code.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 tools/bpf/bpftool/Documentation/bpftool-prog.txt | 11 +++---
 tools/bpf/bpftool/Makefile                       |  7 ++--
 tools/bpf/bpftool/main.h                         | 10 ++----
 tools/bpf/bpftool/prog.c                         | 44 +++++++++++++++++++-----
 4 files changed, 51 insertions(+), 21 deletions(-)

diff --git a/tools/bpf/bpftool/Documentation/bpftool-prog.txt b/tools/bpf/bpftool/Documentation/bpftool-prog.txt
index 79791bf11e4d..7f33260f5683 100644
--- a/tools/bpf/bpftool/Documentation/bpftool-prog.txt
+++ b/tools/bpf/bpftool/Documentation/bpftool-prog.txt
@@ -11,7 +11,7 @@ SYNOPSIS
 ========
 
 |	**bpftool** prog show [*PROG*]
-|	**bpftool** prog dump xlated *PROG*  file *FILE*
+|	**bpftool** prog dump xlated *PROG* [file *FILE*] [opcodes]
 |	**bpftool** prog dump jited  *PROG* [file *FILE*] [opcodes]
 |	**bpftool** prog pin *PROG* *FILE*
 |	**bpftool** prog help
@@ -28,9 +28,12 @@ DESCRIPTION
 		  Output will start with program ID followed by program type and
 		  zero or more named attributes (depending on kernel version).
 
-	**bpftool prog dump xlated** *PROG*  **file** *FILE*
-		  Dump eBPF instructions of the program from the kernel to a
-		  file.
+	**bpftool prog dump xlated** *PROG* [**file** *FILE*] [**opcodes**]
+		  Dump eBPF instructions of the program from the kernel.
+		  If *FILE* is specified image will be written to a file,
+		  otherwise it will be disassembled and printed to stdout.
+
+		  **opcodes** controls if raw opcodes will be printed.
 
 	**bpftool prog dump jited**  *PROG* [**file** *FILE*] [**opcodes**]
 		  Dump jited image (host machine code) of the program.
diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile
index 8705ee44664d..4f339824ca57 100644
--- a/tools/bpf/bpftool/Makefile
+++ b/tools/bpf/bpftool/Makefile
@@ -51,7 +51,7 @@ CC = gcc
 
 CFLAGS += -O2
 CFLAGS += -W -Wall -Wextra -Wno-unused-parameter -Wshadow
-CFLAGS += -D__EXPORTED_HEADERS__ -I$(srctree)/tools/include/uapi -I$(srctree)/tools/include -I$(srctree)/tools/lib/bpf
+CFLAGS += -D__EXPORTED_HEADERS__ -I$(srctree)/tools/include/uapi -I$(srctree)/tools/include -I$(srctree)/tools/lib/bpf -I$(srctree)/kernel/bpf/
 LIBS = -lelf -lbfd -lopcodes $(LIBBPF)
 
 include $(wildcard *.d)
@@ -59,7 +59,10 @@ include $(wildcard *.d)
 all: $(OUTPUT)bpftool
 
 SRCS=$(wildcard *.c)
-OBJS=$(patsubst %.c,$(OUTPUT)%.o,$(SRCS))
+OBJS=$(patsubst %.c,$(OUTPUT)%.o,$(SRCS)) $(OUTPUT)disasm.o
+
+$(OUTPUT)disasm.o: $(srctree)/kernel/bpf/disasm.c
+	$(QUIET_CC)$(COMPILE.c) -MMD -o $@ $<
 
 $(OUTPUT)bpftool: $(OBJS) $(LIBBPF)
 	$(QUIET_LINK)$(CC) $(CFLAGS) -o $@ $^ $(LIBS)
diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index 85d2d7870a58..d7dd9965f40a 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -36,11 +36,12 @@
 #ifndef __BPF_TOOL_H
 #define __BPF_TOOL_H
 
+/* BDF and kernel.h both define GCC_VERSION, differently */
+#undef GCC_VERSION
 #include <stdbool.h>
 #include <stdio.h>
 #include <linux/bpf.h>
-
-#define ARRAY_SIZE(a)	(sizeof(a) / sizeof(a[0]))
+#include <linux/kernel.h>
 
 #define err(msg...)	fprintf(stderr, "Error: " msg)
 #define warn(msg...)	fprintf(stderr, "Warning: " msg)
@@ -48,11 +49,6 @@
 
 #define ptr_to_u64(ptr)	((__u64)(unsigned long)(ptr))
 
-#define min(a, b)							\
-	({ typeof(a) _a = (a); typeof(b) _b = (b); _a > _b ? _b : _a; })
-#define max(a, b)							\
-	({ typeof(a) _a = (a); typeof(b) _b = (b); _a < _b ? _b : _a; })
-
 #define NEXT_ARG()	({ argc--; argv++; if (argc < 0) usage(); })
 #define NEXT_ARGP()	({ (*argc)--; (*argv)++; if (*argc < 0) usage(); })
 #define BAD_ARG()	({ err("what is '%s'?\n", *argv); -1; })
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index 421ba89ce86a..2cdc8f41809f 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -35,6 +35,7 @@
 
 #include <errno.h>
 #include <fcntl.h>
+#include <stdarg.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
@@ -46,6 +47,7 @@
 #include <bpf.h>
 
 #include "main.h"
+#include "disasm.h"
 
 static const char * const prog_type_name[] = {
 	[BPF_PROG_TYPE_UNSPEC]		= "unspec",
@@ -297,11 +299,39 @@ static int do_show(int argc, char **argv)
 	return 0;
 }
 
+static void print_insn(const char *fmt, ...)
+{
+	va_list args;
+
+	va_start(args, fmt);
+	vprintf(fmt, args);
+	va_end(args);
+}
+
+static void dump_xlated(void *buf, unsigned int len, bool opcodes)
+{
+	struct bpf_insn *insn = buf;
+	unsigned int i;
+
+	for (i = 0; i < len / sizeof(*insn); i++) {
+		printf("% 4d: ", i);
+		print_bpf_insn(print_insn, insn + i, true);
+
+		if (opcodes) {
+			printf("       ");
+			print_hex(insn + i, 8, " ");
+			printf("\n");
+		}
+
+		if (insn[i].code == (BPF_LD | BPF_IMM | BPF_DW))
+			i++;
+	}
+}
+
 static int do_dump(int argc, char **argv)
 {
 	struct bpf_prog_info info = {};
 	__u32 len = sizeof(info);
-	bool can_disasm = false;
 	unsigned int buf_size;
 	char *filepath = NULL;
 	bool opcodes = false;
@@ -315,7 +345,6 @@ static int do_dump(int argc, char **argv)
 	if (is_prefix(*argv, "jited")) {
 		member_len = &info.jited_prog_len;
 		member_ptr = &info.jited_prog_insns;
-		can_disasm = true;
 	} else if (is_prefix(*argv, "xlated")) {
 		member_len = &info.xlated_prog_len;
 		member_ptr = &info.xlated_prog_insns;
@@ -346,10 +375,6 @@ static int do_dump(int argc, char **argv)
 		NEXT_ARG();
 	}
 
-	if (!filepath && !can_disasm) {
-		err("expected 'file' got %s\n", *argv);
-		return -1;
-	}
 	if (argc) {
 		usage();
 		return -1;
@@ -409,7 +434,10 @@ static int do_dump(int argc, char **argv)
 			goto err_free;
 		}
 	} else {
-		disasm_print_insn(buf, *member_len, opcodes);
+		if (member_len == &info.jited_prog_len)
+			disasm_print_insn(buf, *member_len, opcodes);
+		else
+			dump_xlated(buf, *member_len, opcodes);
 	}
 
 	free(buf);
@@ -430,7 +458,7 @@ static int do_help(int argc, char **argv)
 {
 	fprintf(stderr,
 		"Usage: %s %s show [PROG]\n"
-		"       %s %s dump xlated PROG  file FILE\n"
+		"       %s %s dump xlated PROG [file FILE] [opcodes]\n"
 		"       %s %s dump jited  PROG [file FILE] [opcodes]\n"
 		"       %s %s pin   PROG FILE\n"
 		"       %s %s help\n"
-- 
2.14.1

^ permalink raw reply related

* [PATCH iproute2 1/3] ss: allow AF_FAMILY constants >32
From: Stefan Hajnoczi @ 2017-10-03 17:57 UTC (permalink / raw)
  To: netdev; +Cc: Stephen Hemminger, Jorgen Hansen, Dexuan Cui, Stefan Hajnoczi
In-Reply-To: <20171003175744.24987-1-stefanha@redhat.com>

Linux has more than 32 address families defined in <bits/socket.h>.  Use
a 64-bit type so all of them can be represented in the filter->families
bitmask.

It's easy to introduce bugs when using (1 << AF_FAMILY) because the
value is 32-bit.  This can produce incorrect results from bitmask
operations so introduce the FAMILY_MASK() macro to eliminate these bugs.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 misc/ss.c | 54 ++++++++++++++++++++++++++++--------------------------
 1 file changed, 28 insertions(+), 26 deletions(-)

diff --git a/misc/ss.c b/misc/ss.c
index dd8dfaa4..12a31c90 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -170,55 +170,57 @@ enum {
 struct filter {
 	int dbs;
 	int states;
-	int families;
+	__u64 families;
 	struct ssfilter *f;
 	bool kill;
 };
 
+#define FAMILY_MASK(family) ((__u64)1 << (family))
+
 static const struct filter default_dbs[MAX_DB] = {
 	[TCP_DB] = {
 		.states   = SS_CONN,
-		.families = (1 << AF_INET) | (1 << AF_INET6),
+		.families = FAMILY_MASK(AF_INET) | FAMILY_MASK(AF_INET6),
 	},
 	[DCCP_DB] = {
 		.states   = SS_CONN,
-		.families = (1 << AF_INET) | (1 << AF_INET6),
+		.families = FAMILY_MASK(AF_INET) | FAMILY_MASK(AF_INET6),
 	},
 	[UDP_DB] = {
 		.states   = (1 << SS_ESTABLISHED),
-		.families = (1 << AF_INET) | (1 << AF_INET6),
+		.families = FAMILY_MASK(AF_INET) | FAMILY_MASK(AF_INET6),
 	},
 	[RAW_DB] = {
 		.states   = (1 << SS_ESTABLISHED),
-		.families = (1 << AF_INET) | (1 << AF_INET6),
+		.families = FAMILY_MASK(AF_INET) | FAMILY_MASK(AF_INET6),
 	},
 	[UNIX_DG_DB] = {
 		.states   = (1 << SS_CLOSE),
-		.families = (1 << AF_UNIX),
+		.families = FAMILY_MASK(AF_UNIX),
 	},
 	[UNIX_ST_DB] = {
 		.states   = SS_CONN,
-		.families = (1 << AF_UNIX),
+		.families = FAMILY_MASK(AF_UNIX),
 	},
 	[UNIX_SQ_DB] = {
 		.states   = SS_CONN,
-		.families = (1 << AF_UNIX),
+		.families = FAMILY_MASK(AF_UNIX),
 	},
 	[PACKET_DG_DB] = {
 		.states   = (1 << SS_CLOSE),
-		.families = (1 << AF_PACKET),
+		.families = FAMILY_MASK(AF_PACKET),
 	},
 	[PACKET_R_DB] = {
 		.states   = (1 << SS_CLOSE),
-		.families = (1 << AF_PACKET),
+		.families = FAMILY_MASK(AF_PACKET),
 	},
 	[NETLINK_DB] = {
 		.states   = (1 << SS_CLOSE),
-		.families = (1 << AF_NETLINK),
+		.families = FAMILY_MASK(AF_NETLINK),
 	},
 	[SCTP_DB] = {
 		.states   = SS_CONN,
-		.families = (1 << AF_INET) | (1 << AF_INET6),
+		.families = FAMILY_MASK(AF_INET) | FAMILY_MASK(AF_INET6),
 	},
 };
 
@@ -258,14 +260,14 @@ static void filter_db_set(struct filter *f, int db)
 static void filter_af_set(struct filter *f, int af)
 {
 	f->states	   |= default_afs[af].states;
-	f->families	   |= 1 << af;
+	f->families	   |= FAMILY_MASK(af);
 	do_default	    = 0;
 	preferred_family    = af;
 }
 
 static int filter_af_get(struct filter *f, int af)
 {
-	return f->families & (1 << af);
+	return !!(f->families & FAMILY_MASK(af));
 }
 
 static void filter_default_dbs(struct filter *f)
@@ -302,7 +304,7 @@ static void filter_merge_defaults(struct filter *f)
 			f->families |= default_dbs[db].families;
 	}
 	for (af = 0; af < AF_MAX; af++) {
-		if (!(f->families & (1 << af)))
+		if (!(f->families & FAMILY_MASK(af)))
 			continue;
 
 		if (!(default_afs[af].dbs & f->dbs))
@@ -2608,7 +2610,7 @@ static int show_one_inet_sock(const struct sockaddr_nl *addr,
 	struct inet_diag_msg *r = NLMSG_DATA(h);
 	struct sockstat s = {};
 
-	if (!(diag_arg->f->families & (1 << r->idiag_family)))
+	if (!(diag_arg->f->families & FAMILY_MASK(r->idiag_family)))
 		return 0;
 
 	parse_diag_msg(h, &s);
@@ -2802,7 +2804,7 @@ static int tcp_show(struct filter *f)
 		return -1;
 	}
 
-	if (f->families & (1<<AF_INET)) {
+	if (f->families & FAMILY_MASK(AF_INET)) {
 		if ((fp = net_tcp_open()) == NULL)
 			goto outerr;
 
@@ -2812,7 +2814,7 @@ static int tcp_show(struct filter *f)
 		fclose(fp);
 	}
 
-	if ((f->families & (1<<AF_INET6)) &&
+	if ((f->families & FAMILY_MASK(AF_INET6)) &&
 	    (fp = net_tcp6_open()) != NULL) {
 		setbuffer(fp, buf, bufsize);
 		if (generic_record_read(fp, tcp_show_line, f, AF_INET6))
@@ -2911,7 +2913,7 @@ static int udp_show(struct filter *f)
 	    && inet_show_netlink(f, NULL, IPPROTO_UDP) == 0)
 		return 0;
 
-	if (f->families&(1<<AF_INET)) {
+	if (f->families&FAMILY_MASK(AF_INET)) {
 		if ((fp = net_udp_open()) == NULL)
 			goto outerr;
 		if (generic_record_read(fp, dgram_show_line, f, AF_INET))
@@ -2919,7 +2921,7 @@ static int udp_show(struct filter *f)
 		fclose(fp);
 	}
 
-	if ((f->families&(1<<AF_INET6)) &&
+	if ((f->families&FAMILY_MASK(AF_INET6)) &&
 	    (fp = net_udp6_open()) != NULL) {
 		if (generic_record_read(fp, dgram_show_line, f, AF_INET6))
 			goto outerr;
@@ -2951,7 +2953,7 @@ static int raw_show(struct filter *f)
 	    inet_show_netlink(f, NULL, IPPROTO_RAW) == 0)
 		return 0;
 
-	if (f->families&(1<<AF_INET)) {
+	if (f->families&FAMILY_MASK(AF_INET)) {
 		if ((fp = net_raw_open()) == NULL)
 			goto outerr;
 		if (generic_record_read(fp, dgram_show_line, f, AF_INET))
@@ -2959,7 +2961,7 @@ static int raw_show(struct filter *f)
 		fclose(fp);
 	}
 
-	if ((f->families&(1<<AF_INET6)) &&
+	if ((f->families&FAMILY_MASK(AF_INET6)) &&
 	    (fp = net_raw6_open()) != NULL) {
 		if (generic_record_read(fp, dgram_show_line, f, AF_INET6))
 			goto outerr;
@@ -3703,13 +3705,13 @@ static int handle_follow_request(struct filter *f)
 	int groups = 0;
 	struct rtnl_handle rth;
 
-	if (f->families & (1 << AF_INET) && f->dbs & (1 << TCP_DB))
+	if (f->families & FAMILY_MASK(AF_INET) && f->dbs & (1 << TCP_DB))
 		groups |= 1 << (SKNLGRP_INET_TCP_DESTROY - 1);
-	if (f->families & (1 << AF_INET) && f->dbs & (1 << UDP_DB))
+	if (f->families & FAMILY_MASK(AF_INET) && f->dbs & (1 << UDP_DB))
 		groups |= 1 << (SKNLGRP_INET_UDP_DESTROY - 1);
-	if (f->families & (1 << AF_INET6) && f->dbs & (1 << TCP_DB))
+	if (f->families & FAMILY_MASK(AF_INET6) && f->dbs & (1 << TCP_DB))
 		groups |= 1 << (SKNLGRP_INET6_TCP_DESTROY - 1);
-	if (f->families & (1 << AF_INET6) && f->dbs & (1 << UDP_DB))
+	if (f->families & FAMILY_MASK(AF_INET6) && f->dbs & (1 << UDP_DB))
 		groups |= 1 << (SKNLGRP_INET6_UDP_DESTROY - 1);
 
 	if (groups == 0)
-- 
2.13.6

^ permalink raw reply related

* [PATCH iproute2 2/3] include: add <linux/vm_sockets_diag.h>
From: Stefan Hajnoczi @ 2017-10-03 17:57 UTC (permalink / raw)
  To: netdev; +Cc: Stephen Hemminger, Jorgen Hansen, Dexuan Cui, Stefan Hajnoczi
In-Reply-To: <20171003175744.24987-1-stefanha@redhat.com>

This new Linux header file defines the sock_diag interface used by
AF_VSOCK.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/linux/vm_sockets_diag.h | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)
 create mode 100644 include/linux/vm_sockets_diag.h

diff --git a/include/linux/vm_sockets_diag.h b/include/linux/vm_sockets_diag.h
new file mode 100644
index 00000000..14cd7dc5
--- /dev/null
+++ b/include/linux/vm_sockets_diag.h
@@ -0,0 +1,33 @@
+/* AF_VSOCK sock_diag(7) interface for querying open sockets */
+
+#ifndef _UAPI__VM_SOCKETS_DIAG_H__
+#define _UAPI__VM_SOCKETS_DIAG_H__
+
+#include <linux/types.h>
+
+/* Request */
+struct vsock_diag_req {
+	__u8	sdiag_family;	/* must be AF_VSOCK */
+	__u8	sdiag_protocol;	/* must be 0 */
+	__u16	pad;		/* must be 0 */
+	__u32	vdiag_states;	/* query bitmap (e.g. 1 << TCP_LISTEN) */
+	__u32	vdiag_ino;	/* must be 0 (reserved) */
+	__u32	vdiag_show;	/* must be 0 (reserved) */
+	__u32	vdiag_cookie[2];
+};
+
+/* Response */
+struct vsock_diag_msg {
+	__u8	vdiag_family;	/* AF_VSOCK */
+	__u8	vdiag_type;	/* SOCK_STREAM or SOCK_DGRAM */
+	__u8	vdiag_state;	/* sk_state (e.g. TCP_LISTEN) */
+	__u8	vdiag_shutdown; /* local RCV_SHUTDOWN | SEND_SHUTDOWN */
+	__u32   vdiag_src_cid;
+	__u32   vdiag_src_port;
+	__u32   vdiag_dst_cid;
+	__u32   vdiag_dst_port;
+	__u32	vdiag_ino;
+	__u32	vdiag_cookie[2];
+};
+
+#endif /* _UAPI__VM_SOCKETS_DIAG_H__ */
-- 
2.13.6

^ permalink raw reply related

* [PATCH iproute2 3/3] ss: add AF_VSOCK support
From: Stefan Hajnoczi @ 2017-10-03 17:57 UTC (permalink / raw)
  To: netdev; +Cc: Stephen Hemminger, Jorgen Hansen, Dexuan Cui, Stefan Hajnoczi
In-Reply-To: <20171003175744.24987-1-stefanha@redhat.com>

The AF_VSOCK address family is a host<->guest communications channel
supported by VMware, KVM, and Hyper-V.  Initial VMware support was
released in Linux 3.9 in 2013 and transports for other hypervisors were
added later.

AF_VSOCK addresses are <u32 cid, u32 port> tuples.  The 32-bit cid
integer is comparable to an IP address.  AF_VSOCK ports work like
TCP/UDP ports.

Both SOCK_STREAM and SOCK_DGRAM socket types are available.

This patch adds AF_VSOCK support to ss(8) so that sockets can be
observed.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 misc/ss.c     | 184 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 man/man8/ss.8 |   8 ++-
 2 files changed, 188 insertions(+), 4 deletions(-)

diff --git a/misc/ss.c b/misc/ss.c
index 12a31c90..164356a0 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -44,6 +44,7 @@
 #include <linux/packet_diag.h>
 #include <linux/netlink_diag.h>
 #include <linux/sctp.h>
+#include <linux/vm_sockets_diag.h>
 
 #define MAGIC_SEQ 123456
 
@@ -126,6 +127,8 @@ enum {
 	PACKET_R_DB,
 	NETLINK_DB,
 	SCTP_DB,
+	VSOCK_ST_DB,
+	VSOCK_DG_DB,
 	MAX_DB
 };
 
@@ -134,6 +137,7 @@ enum {
 #define ALL_DB ((1<<MAX_DB)-1)
 #define INET_L4_DBM ((1<<TCP_DB)|(1<<UDP_DB)|(1<<DCCP_DB)|(1<<SCTP_DB))
 #define INET_DBM (INET_L4_DBM | (1<<RAW_DB))
+#define VSOCK_DBM ((1<<VSOCK_ST_DB)|(1<<VSOCK_DG_DB))
 
 enum {
 	SS_UNKNOWN,
@@ -222,6 +226,14 @@ static const struct filter default_dbs[MAX_DB] = {
 		.states   = SS_CONN,
 		.families = FAMILY_MASK(AF_INET) | FAMILY_MASK(AF_INET6),
 	},
+	[VSOCK_ST_DB] = {
+		.states   = SS_CONN,
+		.families = FAMILY_MASK(AF_VSOCK),
+	},
+	[VSOCK_DG_DB] = {
+		.states   = SS_CONN,
+		.families = FAMILY_MASK(AF_VSOCK),
+	},
 };
 
 static const struct filter default_afs[AF_MAX] = {
@@ -245,6 +257,10 @@ static const struct filter default_afs[AF_MAX] = {
 		.dbs    = (1 << NETLINK_DB),
 		.states = (1 << SS_CLOSE),
 	},
+	[AF_VSOCK] = {
+		.dbs    = VSOCK_DBM,
+		.states = SS_CONN,
+	},
 };
 
 static int do_default = 1;
@@ -283,6 +299,8 @@ static void filter_default_dbs(struct filter *f)
 	filter_db_set(f, PACKET_DG_DB);
 	filter_db_set(f, NETLINK_DB);
 	filter_db_set(f, SCTP_DB);
+	filter_db_set(f, VSOCK_ST_DB);
+	filter_db_set(f, VSOCK_DG_DB);
 }
 
 static void filter_states_set(struct filter *f, int states)
@@ -791,6 +809,18 @@ static const char *proto_name(int protocol)
 	return "???";
 }
 
+static const char *vsock_netid_name(int type)
+{
+	switch (type) {
+	case SOCK_STREAM:
+		return "v_str";
+	case SOCK_DGRAM:
+		return "v_dgr";
+	default:
+		return "???";
+	}
+}
+
 static void sock_state_print(struct sockstat *s)
 {
 	const char *sock_name;
@@ -823,6 +853,9 @@ static void sock_state_print(struct sockstat *s)
 	case AF_NETLINK:
 		sock_name = "nl";
 		break;
+	case AF_VSOCK:
+		sock_name = vsock_netid_name(s->type);
+		break;
 	default:
 		sock_name = "unknown";
 	}
@@ -1149,6 +1182,8 @@ static int run_ssfilter(struct ssfilter *f, struct sockstat *s)
 			return s->lport == 0 && s->local.data[0] == 0;
 		if (s->local.family == AF_NETLINK)
 			return s->lport < 0;
+		if (s->local.family == AF_VSOCK)
+			return s->lport > 1023;
 
 		return is_ephemeral(s->lport);
 	}
@@ -1524,6 +1559,15 @@ void *parse_devcond(char *name)
 	return res;
 }
 
+static void vsock_set_inet_prefix(inet_prefix *a, __u32 cid)
+{
+	*a = (inet_prefix){
+		.bytelen = sizeof(cid),
+		.family = AF_VSOCK,
+	};
+	memcpy(a->data, &cid, sizeof(cid));
+}
+
 void *parse_hostcond(char *addr, bool is_port)
 {
 	char *port = NULL;
@@ -1598,6 +1642,37 @@ void *parse_hostcond(char *addr, bool is_port)
 		goto out;
 	}
 
+	if (fam == AF_VSOCK || strncmp(addr, "vsock:", 6) == 0) {
+		__u32 cid = ~(__u32)0;
+
+		a.addr.family = AF_VSOCK;
+		if (strncmp(addr, "vsock:", 6) == 0)
+			addr += 6;
+
+		if (is_port)
+			port = addr;
+		else {
+			port = strchr(addr, ':');
+			if (port) {
+				*port = '\0';
+				port++;
+			}
+		}
+
+		if (port && strcmp(port, "*") &&
+		    get_u32((__u32 *)&a.port, port, 0))
+			return NULL;
+
+		if (addr[0] && strcmp(addr, "*")) {
+			a.addr.bitlen = 32;
+			if (get_u32(&cid, addr, 0))
+				return NULL;
+		}
+		vsock_set_inet_prefix(&a.addr, cid);
+		fam = AF_VSOCK;
+		goto out;
+	}
+
 	if (fam == AF_INET || !strncmp(addr, "inet:", 5)) {
 		fam = AF_INET;
 		if (!strncmp(addr, "inet:", 5))
@@ -3674,6 +3749,88 @@ static int netlink_show(struct filter *f)
 	return 0;
 }
 
+static bool vsock_type_skip(struct sockstat *s, struct filter *f)
+{
+	if (s->type == SOCK_STREAM && !(f->dbs & (1 << VSOCK_ST_DB)))
+		return true;
+	if (s->type == SOCK_DGRAM && !(f->dbs & (1 << VSOCK_DG_DB)))
+		return true;
+	return false;
+}
+
+static void vsock_addr_print(inet_prefix *a, __u32 port)
+{
+	char cid_str[sizeof("4294967295")];
+	char port_str[sizeof("4294967295")];
+	__u32 cid;
+
+	memcpy(&cid, a->data, sizeof(cid));
+
+	if (cid == ~(__u32)0)
+		snprintf(cid_str, sizeof(cid_str), "*");
+	else
+		snprintf(cid_str, sizeof(cid_str), "%u", cid);
+
+	if (port == ~(__u32)0)
+		snprintf(port_str, sizeof(port_str), "*");
+	else
+		snprintf(port_str, sizeof(port_str), "%u", port);
+
+	sock_addr_print(cid_str, ":", port_str, NULL);
+}
+
+static void vsock_stats_print(struct sockstat *s, struct filter *f)
+{
+	sock_state_print(s);
+
+	vsock_addr_print(&s->local, s->lport);
+	vsock_addr_print(&s->remote, s->rport);
+
+	proc_ctx_print(s);
+
+	printf("\n");
+}
+
+static int vsock_show_sock(const struct sockaddr_nl *addr,
+			   struct nlmsghdr *nlh, void *arg)
+{
+	struct filter *f = (struct filter *)arg;
+	struct vsock_diag_msg *r = NLMSG_DATA(nlh);
+	struct sockstat stat = {
+		.type = r->vdiag_type,
+		.lport = r->vdiag_src_port,
+		.rport = r->vdiag_dst_port,
+		.state = r->vdiag_state,
+		.ino = r->vdiag_ino,
+	};
+
+	vsock_set_inet_prefix(&stat.local, r->vdiag_src_cid);
+	vsock_set_inet_prefix(&stat.remote, r->vdiag_dst_cid);
+
+	if (vsock_type_skip(&stat, f))
+		return 0;
+
+	if (f->f && run_ssfilter(f->f, &stat) == 0)
+		return 0;
+
+	vsock_stats_print(&stat, f);
+
+	return 0;
+}
+
+static int vsock_show(struct filter *f)
+{
+	DIAG_REQUEST(req, struct vsock_diag_req r);
+
+	if (!filter_af_get(f, AF_VSOCK))
+		return 0;
+
+	req.r.sdiag_family = AF_VSOCK;
+	req.r.vdiag_states = f->states;
+
+	return handle_netlink_request(f, &req.nlh, sizeof(req), vsock_show_sock);
+}
+
 struct sock_diag_msg {
 	__u8 sdiag_family;
 };
@@ -3694,6 +3851,8 @@ static int generic_show_sock(const struct sockaddr_nl *addr,
 		return packet_show_sock(addr, nlh, arg);
 	case AF_NETLINK:
 		return netlink_show_sock(addr, nlh, arg);
+	case AF_VSOCK:
+		return vsock_show_sock(addr, nlh, arg);
 	default:
 		return -1;
 	}
@@ -3921,14 +4080,15 @@ static void _usage(FILE *dest)
 "   -d, --dccp          display only DCCP sockets\n"
 "   -w, --raw           display only RAW sockets\n"
 "   -x, --unix          display only Unix domain sockets\n"
+"       --vsock         display only vsock sockets\n"
 "   -f, --family=FAMILY display sockets of type FAMILY\n"
-"       FAMILY := {inet|inet6|link|unix|netlink|help}\n"
+"       FAMILY := {inet|inet6|link|unix|netlink|vsock|help}\n"
 "\n"
 "   -K, --kill          forcibly close sockets, display what was closed\n"
 "   -H, --no-header     Suppress header line\n"
 "\n"
 "   -A, --query=QUERY, --socket=QUERY\n"
-"       QUERY := {all|inet|tcp|udp|raw|unix|unix_dgram|unix_stream|unix_seqpacket|packet|netlink}[,QUERY]\n"
+"       QUERY := {all|inet|tcp|udp|raw|unix|unix_dgram|unix_stream|unix_seqpacket|packet|netlink|vsock_stream|vsock_dgram}[,QUERY]\n"
 "\n"
 "   -D, --diag=FILE     Dump raw information about TCP sockets to FILE\n"
 "   -F, --filter=FILE   read filter information from FILE\n"
@@ -4001,6 +4161,9 @@ static int scan_state(const char *state)
 	exit(-1);
 }
 
+/* Values 'v' and 'V' are already used so a non-character is used */
+#define OPT_VSOCK 256
+
 static const struct option long_opts[] = {
 	{ "numeric", 0, 0, 'n' },
 	{ "resolve", 0, 0, 'r' },
@@ -4017,6 +4180,7 @@ static const struct option long_opts[] = {
 	{ "udp", 0, 0, 'u' },
 	{ "raw", 0, 0, 'w' },
 	{ "unix", 0, 0, 'x' },
+	{ "vsock", 0, 0, OPT_VSOCK },
 	{ "all", 0, 0, 'a' },
 	{ "listening", 0, 0, 'l' },
 	{ "ipv4", 0, 0, '4' },
@@ -4102,6 +4266,9 @@ int main(int argc, char *argv[])
 		case 'x':
 			filter_af_set(&current_filter, AF_UNIX);
 			break;
+		case OPT_VSOCK:
+			filter_af_set(&current_filter, AF_VSOCK);
+			break;
 		case 'a':
 			state_filter = SS_ALL;
 			break;
@@ -4128,6 +4295,8 @@ int main(int argc, char *argv[])
 				filter_af_set(&current_filter, AF_UNIX);
 			else if (strcmp(optarg, "netlink") == 0)
 				filter_af_set(&current_filter, AF_NETLINK);
+			else if (strcmp(optarg, "vsock") == 0)
+				filter_af_set(&current_filter, AF_VSOCK);
 			else if (strcmp(optarg, "help") == 0)
 				help();
 			else {
@@ -4193,6 +4362,15 @@ int main(int argc, char *argv[])
 					filter_db_set(&current_filter, PACKET_DG_DB);
 				} else if (strcmp(p, "netlink") == 0) {
 					filter_db_set(&current_filter, NETLINK_DB);
+				} else if (strcmp(p, "vsock") == 0) {
+					filter_db_set(&current_filter, VSOCK_ST_DB);
+					filter_db_set(&current_filter, VSOCK_DG_DB);
+				} else if (strcmp(p, "vsock_stream") == 0 ||
+					   strcmp(p, "v_str") == 0) {
+					filter_db_set(&current_filter, VSOCK_ST_DB);
+				} else if (strcmp(p, "vsock_dgram") == 0 ||
+					   strcmp(p, "v_dgr") == 0) {
+					filter_db_set(&current_filter, VSOCK_DG_DB);
 				} else {
 					fprintf(stderr, "ss: \"%s\" is illegal socket table id\n", p);
 					usage();
@@ -4408,6 +4586,8 @@ int main(int argc, char *argv[])
 		dccp_show(&current_filter);
 	if (current_filter.dbs & (1<<SCTP_DB))
 		sctp_show(&current_filter);
+	if (current_filter.dbs & VSOCK_DBM)
+		vsock_show(&current_filter);
 
 	if (show_users || show_proc_ctx || show_sock_ctx)
 		user_ent_destroy();
diff --git a/man/man8/ss.8 b/man/man8/ss.8
index 3bec97f0..3af509e9 100644
--- a/man/man8/ss.8
+++ b/man/man8/ss.8
@@ -125,14 +125,18 @@ Display Unix domain sockets (alias for -f unix).
 .B \-S, \-\-sctp
 Display SCTP sockets.
 .TP
+.B \-\-vsock
+Display vsock sockets (alias for -f vsock).
+.TP
 .B \-f FAMILY, \-\-family=FAMILY
 Display sockets of type FAMILY.
-Currently the following families are supported: unix, inet, inet6, link, netlink.
+Currently the following families are supported: unix, inet, inet6, link, netlink, vsock.
 .TP
 .B \-A QUERY, \-\-query=QUERY, \-\-socket=QUERY
 List of socket tables to dump, separated by commas. The following identifiers
 are understood: all, inet, tcp, udp, raw, unix, packet, netlink, unix_dgram,
-unix_stream, unix_seqpacket, packet_raw, packet_dgram, dccp, sctp.
+unix_stream, unix_seqpacket, packet_raw, packet_dgram, dccp, sctp,
+vsock_stream, vsock_dgram.
 .TP
 .B \-D FILE, \-\-diag=FILE
 Do not display anything, just dump raw information about TCP sockets to FILE after applying filters. If FILE is - stdout is used.
-- 
2.13.6

^ permalink raw reply related

* Re: [PATCH] net: phy: DP83822 initial driver submission
From: Dan Murphy @ 2017-10-03 18:03 UTC (permalink / raw)
  To: Florian Fainelli, andrew; +Cc: netdev
In-Reply-To: <1847c902-8364-3c7c-f079-3d123ba1d3f1@gmail.com>

Florian

Thanks for the review

On 10/03/2017 12:15 PM, Florian Fainelli wrote:
> On 10/03/2017 08:53 AM, Dan Murphy wrote:
>> Add support for the TI  DP83822 10/100Mbit ethernet phy.
>>
>> The DP83822 provides flexibility to connect to a MAC through a
>> standard MII, RMII or RGMII interface.
>>
>> Datasheet:
>> http://www.ti.com/product/DP83822I/datasheet
> 
> This looks pretty good, just a few nits below.
> 
> [snip]
> 
>> +static int dp83822_set_wol(struct phy_device *phydev,
>> +			   struct ethtool_wolinfo *wol)
>> +{
>> +	struct net_device *ndev = phydev->attached_dev;
>> +	u16 value;
>> +	const u8 *mac;
>> +
>> +	if (wol->wolopts & (WAKE_MAGIC | WAKE_MAGICSECURE)) {
>> +		mac = (const u8 *)ndev->dev_addr;
>> +
>> +		if (!is_valid_ether_addr(mac))
>> +			return -EFAULT;
> 
> -EINVAL maybe?

I referenced the at803x driver for the error code.  I can change it if you want it to be -EINVAL.
I can submit a patch to do the same for the other driver.

-EIVAL does make more sense.


> 
>> +
>> +		/* MAC addresses start with byte 5, but stored in mac[0].
>> +		 * 822 PHYs store bytes 4|5, 2|3, 0|1
>> +		 */
>> +		phy_write_mmd(phydev, DP83822_DEVADDR,
>> +			      MII_DP83822_WOL_DA1, (mac[1] << 8) | mac[0]);
>> +		phy_write_mmd(phydev, DP83822_DEVADDR,
>> +			      MII_DP83822_WOL_DA2, (mac[3] << 8) | mac[2]);
>> +		phy_write_mmd(phydev, DP83822_DEVADDR, MII_DP83822_WOL_DA3,
>> +			      (mac[5] << 8) | mac[4]);
>> +
>> +		value = phy_read_mmd(phydev, DP83822_DEVADDR,
>> +				     MII_DP83822_WOL_CFG);
>> +		if (wol->wolopts & WAKE_MAGIC)
>> +			value |= DP83822_WOL_MAGIC_EN;
>> +		else
>> +			value &= ~DP83822_WOL_MAGIC_EN;
>> +
>> +		if (wol->wolopts & WAKE_MAGICSECURE) {
>> +			value |= DP83822_WOL_SECURE_ON;
> 
> Just in case any of the writes below fail, you would probably want to
> set this bit last, thus indicating that the password was successfully set.

Good point

> 
>> +			phy_write_mmd(phydev, DP83822_DEVADDR,
>> +				      MII_DP83822_RXSOP1,
>> +				      (wol->sopass[1] << 8) | wol->sopass[0]);
>> +			phy_write_mmd(phydev, DP83822_DEVADDR,
>> +				      MII_DP83822_RXSOP2,
>> +				      (wol->sopass[3] << 8) | wol->sopass[2]);
>> +			phy_write_mmd(phydev, DP83822_DEVADDR,
>> +				      MII_DP83822_RXSOP3,
>> +				      (wol->sopass[5] << 8) | wol->sopass[4]);
> 
> In the else clause, you don't appear to be clearing the MagicPacket
> SecureOn password, but your get_wol function does not check for
> DP83822_WOL_SECURE_ON before returning the secure password, so either
> one of these two should be fixed. I would go with fixing the get_wol
> function see below.

OK

> 
>> +		} else {
>> +			value &= ~DP83822_WOL_SECURE_ON;
>> +		}
>> +
>> +		value |= (DP83822_WOL_EN | DP83822_WOL_CLR_INDICATION |
>> +			  DP83822_WOL_CLR_INDICATION);
> 
> The extra parenthesis should not be required here.

I did not code that in.  I had to add it after Checkpatch cribbed about it.
Let me know if you want me to remove it.

> 
>> +		phy_write_mmd(phydev, DP83822_DEVADDR, MII_DP83822_WOL_CFG,
>> +			      value);
>> +	} else {
>> +		value =
>> +		    phy_read_mmd(phydev, DP83822_DEVADDR, MII_DP83822_WOL_CFG);
>> +		value &= (~DP83822_WOL_EN);
> 
> Same here, parenthesis should not be needed.

There are three lines of code in the else.  This code all needs to be excuted in the else case.
I might reformat it to read better.  Lindent messed that one up.

> 
>> +		phy_write_mmd(phydev, DP83822_DEVADDR, MII_DP83822_WOL_CFG,
>> +			      value);
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static void dp83822_get_wol(struct phy_device *phydev,
>> +			    struct ethtool_wolinfo *wol)
>> +{
>> +	int value;
>> +
>> +	wol->supported = (WAKE_MAGIC | WAKE_MAGICSECURE);
>> +	wol->wolopts = 0;
>> +
>> +	value = phy_read_mmd(phydev, DP83822_DEVADDR, MII_DP83822_WOL_CFG);
>> +	if (value & DP83822_WOL_MAGIC_EN)
>> +		wol->wolopts |= WAKE_MAGIC;
>> +
>> +	if (value & DP83822_WOL_SECURE_ON)
>> +		wol->wolopts |= WAKE_MAGICSECURE;
>> +
>> +	if (~value & DP83822_WOL_CLR_INDICATION)
>> +		wol->wolopts = 0;
>> +
>> +	wol->sopass[0] = (phy_read_mmd(phydev,
>> +				       DP83822_DEVADDR,
>> +				       MII_DP83822_RXSOP1) & 0xFF);
>> +	wol->sopass[1] =
>> +	    (phy_read_mmd(phydev, DP83822_DEVADDR, MII_DP83822_RXSOP1) >> 8);
> 
> You can save about twice the amount of reads by using a temporary
> variable to hold the 16-bit register ;)

You are right I can clean that up.

> 
>> +	wol->sopass[2] =
>> +	    (phy_read_mmd(phydev, DP83822_DEVADDR, MII_DP83822_RXSOP2) & 0xFF);
>> +	wol->sopass[3] =
>> +	    (phy_read_mmd(phydev, DP83822_DEVADDR, MII_DP83822_RXSOP2) >> 8);
>> +	wol->sopass[4] =
>> +	    (phy_read_mmd(phydev, DP83822_DEVADDR, MII_DP83822_RXSOP3) & 0xFF);
>> +	wol->sopass[5] =
>> +	    (phy_read_mmd(phydev, DP83822_DEVADDR, MII_DP83822_RXSOP3) >> 8);
> 
> Unless DP83822_WOL_SECURE_ON is set, you probably should not try reading
> the password at all, because there is no guarantee it has been correctly
> set.
> 

OK

>> +}
> 
>> +static int dp83822_phy_reset(struct phy_device *phydev)
>> +{
>> +	int err;
>> +
>> +	err = phy_write(phydev, MII_DP83822_RESET_CTRL, DP83822_HW_RESET);
>> +	if (err < 0)
>> +		return err;
>> +
>> +	return 0;
>> +}
>> +
>> +static int dp83822_suspend(struct phy_device *phydev)
>> +{
>> +	int value;
>> +
>> +	mutex_lock(&phydev->lock);
>> +
>> +	value = phy_read_mmd(phydev, DP83822_DEVADDR, MII_DP83822_WOL_CFG);
>> +	if (~value & DP83822_WOL_EN) {
>> +		value = phy_read(phydev, MII_BMCR);
>> +		phy_write(phydev, MII_BMCR, value | BMCR_PDOWN);
>> +	}
> 
> Can you use genphy_suspend() here along with careful locking of course?
> 

genphy_suspend does not have WoL.

>> +
>> +	mutex_unlock(&phydev->lock);
>> +
>> +	return 0;
>> +}
>> +
>> +static int dp83822_resume(struct phy_device *phydev)
>> +{
>> +	int value;
>> +
>> +	mutex_lock(&phydev->lock);
>> +
>> +	value = phy_read(phydev, MII_BMCR);
>> +	phy_write(phydev, MII_BMCR, value & ~BMCR_PDOWN);
> 
> And genphy_resume() here as well?

genphy_resume does not have WoL.

> 
>> +
>> +	value = phy_read_mmd(phydev, DP83822_DEVADDR, MII_DP83822_WOL_CFG);
>> +
>> +	phy_write_mmd(phydev, DP83822_DEVADDR, MII_DP83822_WOL_CFG, value |
>> +		      DP83822_WOL_CLR_INDICATION);
>> +
>> +	mutex_unlock(&phydev->lock);
>> +
>> +	return 0;
>> +}
>> +
>> +static struct phy_driver dp83822_driver[] = {
>> +	{
>> +	 .phy_id = DP83822_PHY_ID,
>> +	 .phy_id_mask = 0xfffffff0,
>> +	 .name = "TI DP83822",
>> +	 .features = PHY_BASIC_FEATURES,
>> +	 .flags = PHY_HAS_INTERRUPT,
>> +
>> +	 .config_init = genphy_config_init,
>> +	 .soft_reset = dp83822_phy_reset,
>> +
>> +	 .get_wol = dp83822_get_wol,
>> +	 .set_wol = dp83822_set_wol,
>> +
>> +	 /* IRQ related */
>> +	 .ack_interrupt = dp83822_ack_interrupt,
>> +	 .config_intr = dp83822_config_intr,
>> +
>> +	 .config_aneg = genphy_config_aneg,
>> +	 .read_status = genphy_read_status,
>> +	 .suspend = dp83822_suspend,
>> +	 .resume = dp83822_resume,
>> +	 },
> 
> I would omit newlines between definitions of callbacks, but this is
> really a personal preference. Unless you are planning on adding new IDs,
> you could also avoid using an array of 1 element and just a plain
> phy_driver structure, but that's not a big deal either.

Yes there is a plan to add another phy id in early 2018 to this driver.

> 


-- 
------------------
Dan Murphy

^ permalink raw reply

* Re: [PATCH v2 net-next 06/12] qed: Add LL2 slowpath handling
From: Kalderon, Michal @ 2017-10-03 18:05 UTC (permalink / raw)
  To: David Miller
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org, Elior, Ariel
In-Reply-To: <20171003.101712.715882117516958741.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>

From: David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
Sent: Tuesday, October 3, 2017 8:17 PM
>> @@ -423,6 +423,41 @@ static void qed_ll2_rxq_parse_reg(struct qed_hwfn *p_hwfn,
>>  }
>>
>>  static int
>> +qed_ll2_handle_slowpath(struct qed_hwfn *p_hwfn,
>> +                     struct qed_ll2_info *p_ll2_conn,
>> +                     union core_rx_cqe_union *p_cqe,
>> +                     unsigned long *p_lock_flags)
>> +{
>...
>> +     spin_unlock_irqrestore(&p_rx->lock, *p_lock_flags);
>> +
>
>You can't drop this lock.
>
>Another thread can enter the loop of our caller and process RX queue
>entries, then we would return from here and try to process the same
>entries again.

The lock is there to synchronize access to chains between qed_ll2_rxq_completion
and qed_ll2_post_rx_buffer. qed_ll2_rxq_completion can't be called from
different threads, the light l2 uses the single sp status block we have.
The reason we release the lock is to avoid a deadlock where as a result of calling
upper-layer driver it will potentially post additional rx-buffers.



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH net-next 2/2] flow_dissector: dissect tunnel info
From: Tom Herbert @ 2017-10-03 18:17 UTC (permalink / raw)
  To: Simon Horman
  Cc: David Miller, Jiri Pirko, Jamal Hadi Salim, Cong Wang,
	Linux Kernel Network Developers, oss-drivers
In-Reply-To: <20171003094052.GA20592@netronome.com>

On Tue, Oct 3, 2017 at 2:40 AM, Simon Horman <simon.horman@netronome.com> wrote:
> On Mon, Oct 02, 2017 at 01:37:55PM -0700, Tom Herbert wrote:
>> On Mon, Oct 2, 2017 at 1:41 AM, Simon Horman <simon.horman@netronome.com> wrote:
>> > Move dissection of tunnel info from the flower classifier to the flow
>> > dissector where all other dissection occurs.  This should not have any
>> > behavioural affect on other users of the flow dissector.
>
> ...

> I feel that we are circling back the perennial issue of flower using the
> flow dissector in a somewhat broader/different way than many/all other
> users of the flow dissector.
>
Simon,

It's more like __skb_flow_dissect is already an incredibly complex
function and because of that it's difficult to maintain. We need to
measure changes against that fact. For this patch, there is precisely
one user (cls_flower.c) and it's not at all clear to me if there will
be ever any more (e.g. for hashing we don't need tunnel info). IMO, it
should be just as easy and less convolution for everyone to have
flower call __skb_flow_dissect_tunnel_info directly and not call if
from __skb_flow_dissect.

Tom

^ permalink raw reply

* [PATCH net-next v2 0/3] bridge: neigh msg proxy and flood suppression support
From: Roopa Prabhu @ 2017-10-03 18:21 UTC (permalink / raw)
  To: davem; +Cc: netdev, nikolay, stephen, bridge

From: Roopa Prabhu <roopa@cumulusnetworks.com>

This series implements arp and nd suppression in the bridge
driver for ethernet vpns. It implements rfc7432, section 10
https://tools.ietf.org/html/rfc7432#section-10
for ethernet VPN deployments. It is similar to the existing
BR_ARP_PROXY flag but has a few semantic differences to conform
to EVPN standard. In case of EVPN, it is mainly used to avoid flooding to
tunnel ports like vxlan. Unlike the existing flags it suppresses flood
of all neigh discovery packets (arp, nd) to tunnel ports.

v2 : rebase to latest + address some optimization feedback from Nikolay.

Roopa Prabhu (3):
  bridge: add new BR_NEIGH_SUPPRESS port flag to suppress arp and nd
    flood
  neigh arp suppress first
  bridge: suppress nd messages from going to BR_NEIGH_SUPPRESS ports

 include/linux/if_bridge.h    |   1 +
 include/uapi/linux/if_link.h |   1 +
 net/bridge/Makefile          |   2 +-
 net/bridge/br_arp_nd_proxy.c | 492 +++++++++++++++++++++++++++++++++++++++++++
 net/bridge/br_device.c       |  18 ++
 net/bridge/br_forward.c      |   3 +-
 net/bridge/br_if.c           |   5 +
 net/bridge/br_input.c        |  73 ++-----
 net/bridge/br_netlink.c      |  16 +-
 net/bridge/br_private.h      |   9 +
 net/bridge/br_sysfs_if.c     |   2 +
 11 files changed, 561 insertions(+), 61 deletions(-)
 create mode 100644 net/bridge/br_arp_nd_proxy.c

-- 
2.1.4

^ permalink raw reply

* [PATCH net-next v2 1/3] bridge: add new BR_NEIGH_SUPPRESS port flag to suppress arp and nd flood
From: Roopa Prabhu @ 2017-10-03 18:21 UTC (permalink / raw)
  To: davem; +Cc: netdev, nikolay, stephen, bridge
In-Reply-To: <1507054876-16746-1-git-send-email-roopa@cumulusnetworks.com>

From: Roopa Prabhu <roopa@cumulusnetworks.com>

This patch adds a new bridge port flag BR_NEIGH_SUPPRESS to
suppress arp and nd flood on bridge ports. It implements
rfc7432, section 10.
https://tools.ietf.org/html/rfc7432#section-10
for ethernet VPN deployments. It is similar to the existing
BR_ARP_PROXY flag but has a few semantic differences to conform
to EVPN standard. In case of EVPN, it is mainly used to
avoid flooding to tunnel ports like vxlan. Unlike the existing
flags it suppresses flood of all neigh discovery packets
(arp, nd) to tunnel ports.

This patch adds netlink and sysfs support to set this bridge port
flag.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
---
 include/linux/if_bridge.h    |  1 +
 include/uapi/linux/if_link.h |  1 +
 net/bridge/Makefile          |  2 +-
 net/bridge/br_arp_nd_proxy.c | 32 ++++++++++++++++++++++++++++++++
 net/bridge/br_forward.c      |  3 ++-
 net/bridge/br_if.c           |  5 +++++
 net/bridge/br_netlink.c      | 12 +++++++++++-
 net/bridge/br_private.h      |  2 ++
 net/bridge/br_sysfs_if.c     |  2 ++
 9 files changed, 57 insertions(+), 3 deletions(-)
 create mode 100644 net/bridge/br_arp_nd_proxy.c

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index 3cd18ac..316ee11 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -49,6 +49,7 @@ struct br_ip_list {
 #define BR_MULTICAST_TO_UNICAST	BIT(12)
 #define BR_VLAN_TUNNEL		BIT(13)
 #define BR_BCAST_FLOOD		BIT(14)
+#define BR_NEIGH_SUPPRESS	BIT(15)
 
 #define BR_DEFAULT_AGEING_TIME	(300 * HZ)
 
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index ea87bd7..0d51f4f 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -326,6 +326,7 @@ enum {
 	IFLA_BRPORT_VLAN_TUNNEL,
 	IFLA_BRPORT_BCAST_FLOOD,
 	IFLA_BRPORT_GROUP_FWD_MASK,
+	IFLA_BRPORT_NEIGH_SUPPRESS,
 	__IFLA_BRPORT_MAX
 };
 #define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1)
diff --git a/net/bridge/Makefile b/net/bridge/Makefile
index 40b1ede..4aee55f 100644
--- a/net/bridge/Makefile
+++ b/net/bridge/Makefile
@@ -7,7 +7,7 @@ obj-$(CONFIG_BRIDGE) += bridge.o
 bridge-y	:= br.o br_device.o br_fdb.o br_forward.o br_if.o br_input.o \
 			br_ioctl.o br_stp.o br_stp_bpdu.o \
 			br_stp_if.o br_stp_timer.o br_netlink.o \
-			br_netlink_tunnel.o
+			br_netlink_tunnel.o br_arp_nd_proxy.o
 
 bridge-$(CONFIG_SYSFS) += br_sysfs_if.o br_sysfs_br.o
 
diff --git a/net/bridge/br_arp_nd_proxy.c b/net/bridge/br_arp_nd_proxy.c
new file mode 100644
index 0000000..f889ad5
--- /dev/null
+++ b/net/bridge/br_arp_nd_proxy.c
@@ -0,0 +1,32 @@
+/*
+ *  Handle bridge arp/nd proxy/suppress
+ *
+ *  Copyright (C) 2017 Cumulus Networks
+ *  Copyright (c) 2017 Roopa Prabhu <roopa@cumulusnetworks.com>
+ *
+ *  Authors:
+ *	Roopa Prabhu <roopa@cumulusnetworks.com>
+ *
+ *  This program is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU General Public License
+ *  as published by the Free Software Foundation; either version
+ *  2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/kernel.h>
+#include "br_private.h"
+
+void br_recalculate_neigh_suppress_enabled(struct net_bridge *br)
+{
+	struct net_bridge_port *p;
+	bool neigh_suppress = false;
+
+	list_for_each_entry(p, &br->port_list, list) {
+		if (p->flags & BR_NEIGH_SUPPRESS) {
+			neigh_suppress = true;
+			break;
+		}
+	}
+
+	br->neigh_suppress_enabled = neigh_suppress;
+}
diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
index 48fb174..7a50dc5 100644
--- a/net/bridge/br_forward.c
+++ b/net/bridge/br_forward.c
@@ -204,7 +204,8 @@ void br_flood(struct net_bridge *br, struct sk_buff *skb,
 		/* Do not flood to ports that enable proxy ARP */
 		if (p->flags & BR_PROXYARP)
 			continue;
-		if ((p->flags & BR_PROXYARP_WIFI) &&
+		if ((p->flags & BR_PROXYARP_WIFI ||
+		     p->flags & BR_NEIGH_SUPPRESS) &&
 		    BR_INPUT_SKB_CB(skb)->proxyarp_replied)
 			continue;
 
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index f3aef22..8f615d4 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -310,6 +310,8 @@ void br_dev_delete(struct net_device *dev, struct list_head *head)
 		del_nbp(p);
 	}
 
+	br_recalculate_neigh_suppress_enabled(br);
+
 	br_fdb_delete_by_port(br, NULL, 0, 1);
 
 	cancel_delayed_work_sync(&br->gc_work);
@@ -653,4 +655,7 @@ void br_port_flags_change(struct net_bridge_port *p, unsigned long mask)
 
 	if (mask & BR_AUTO_MASK)
 		nbp_update_port_count(br);
+
+	if (mask & BR_NEIGH_SUPPRESS)
+		br_recalculate_neigh_suppress_enabled(br);
 }
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index dea88a2..d8c2706 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -138,6 +138,7 @@ static inline size_t br_port_info_size(void)
 		+ nla_total_size(1)	/* IFLA_BRPORT_PROXYARP */
 		+ nla_total_size(1)	/* IFLA_BRPORT_PROXYARP_WIFI */
 		+ nla_total_size(1)	/* IFLA_BRPORT_VLAN_TUNNEL */
+		+ nla_total_size(1)	/* IFLA_BRPORT_NEIGH_SUPPRESS */
 		+ nla_total_size(sizeof(struct ifla_bridge_id))	/* IFLA_BRPORT_ROOT_ID */
 		+ nla_total_size(sizeof(struct ifla_bridge_id))	/* IFLA_BRPORT_BRIDGE_ID */
 		+ nla_total_size(sizeof(u16))	/* IFLA_BRPORT_DESIGNATED_PORT */
@@ -210,7 +211,9 @@ static int br_port_fill_attrs(struct sk_buff *skb,
 	    nla_put_u8(skb, IFLA_BRPORT_CONFIG_PENDING, p->config_pending) ||
 	    nla_put_u8(skb, IFLA_BRPORT_VLAN_TUNNEL, !!(p->flags &
 							BR_VLAN_TUNNEL)) ||
-	    nla_put_u16(skb, IFLA_BRPORT_GROUP_FWD_MASK, p->group_fwd_mask))
+	    nla_put_u16(skb, IFLA_BRPORT_GROUP_FWD_MASK, p->group_fwd_mask) ||
+	    nla_put_u8(skb, IFLA_BRPORT_NEIGH_SUPPRESS, !!(p->flags &
+							BR_NEIGH_SUPPRESS)))
 		return -EMSGSIZE;
 
 	timerval = br_timer_value(&p->message_age_timer);
@@ -692,6 +695,7 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[])
 {
 	unsigned long old_flags = p->flags;
 	bool br_vlan_tunnel_old = false;
+	int neigh_suppress_old = 0;
 	int err;
 
 	err = br_set_port_flag(p, tb, IFLA_BRPORT_MODE, BR_HAIRPIN_MODE);
@@ -785,6 +789,12 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[])
 		p->group_fwd_mask = fwd_mask;
 	}
 
+	neigh_suppress_old = (p->flags & BR_NEIGH_SUPPRESS);
+	br_set_port_flag(p, tb, IFLA_BRPORT_NEIGH_SUPPRESS,
+			 BR_NEIGH_SUPPRESS);
+	if (neigh_suppress_old != (p->flags & BR_NEIGH_SUPPRESS))
+		br_recalculate_neigh_suppress_enabled(p->br);
+
 	br_port_flags_change(p, old_flags ^ p->flags);
 	return 0;
 }
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 020c709..f47332e 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -404,6 +404,7 @@ struct net_bridge {
 #ifdef CONFIG_NET_SWITCHDEV
 	int offload_fwd_mark;
 #endif
+	bool				neigh_suppress_enabled;
 };
 
 struct br_input_skb_cb {
@@ -1138,4 +1139,5 @@ static inline void br_switchdev_frame_unmark(struct sk_buff *skb)
 }
 #endif /* CONFIG_NET_SWITCHDEV */
 
+void br_recalculate_neigh_suppress_enabled(struct net_bridge *br);
 #endif
diff --git a/net/bridge/br_sysfs_if.c b/net/bridge/br_sysfs_if.c
index 9110d5e..0a1fa9c 100644
--- a/net/bridge/br_sysfs_if.c
+++ b/net/bridge/br_sysfs_if.c
@@ -191,6 +191,7 @@ BRPORT_ATTR_FLAG(proxyarp, BR_PROXYARP);
 BRPORT_ATTR_FLAG(proxyarp_wifi, BR_PROXYARP_WIFI);
 BRPORT_ATTR_FLAG(multicast_flood, BR_MCAST_FLOOD);
 BRPORT_ATTR_FLAG(broadcast_flood, BR_BCAST_FLOOD);
+BRPORT_ATTR_FLAG(neigh_suppress, BR_NEIGH_SUPPRESS);
 
 #ifdef CONFIG_BRIDGE_IGMP_SNOOPING
 static ssize_t show_multicast_router(struct net_bridge_port *p, char *buf)
@@ -241,6 +242,7 @@ static const struct brport_attribute *brport_attrs[] = {
 	&brport_attr_multicast_flood,
 	&brport_attr_broadcast_flood,
 	&brport_attr_group_fwd_mask,
+	&brport_attr_neigh_suppress,
 	NULL
 };
 
-- 
2.1.4

^ permalink raw reply related

* [PATCH net-next v2 2/3] bridge: suppress arp pkts on BR_NEIGH_SUPPRESS ports
From: Roopa Prabhu @ 2017-10-03 18:21 UTC (permalink / raw)
  To: davem; +Cc: netdev, nikolay, stephen, bridge
In-Reply-To: <1507054876-16746-1-git-send-email-roopa@cumulusnetworks.com>

From: Roopa Prabhu <roopa@cumulusnetworks.com>

This patch avoids flooding and proxies arp packets
for BR_NEIGH_SUPPRESS ports.

Moves existing br_do_proxy_arp to br_do_proxy_suppress_arp
to support both proxy arp and neigh suppress.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
---
 net/bridge/br_arp_nd_proxy.c | 184 +++++++++++++++++++++++++++++++++++++++++++
 net/bridge/br_device.c       |   9 +++
 net/bridge/br_input.c        |  63 ++-------------
 net/bridge/br_private.h      |   3 +
 4 files changed, 201 insertions(+), 58 deletions(-)

diff --git a/net/bridge/br_arp_nd_proxy.c b/net/bridge/br_arp_nd_proxy.c
index f889ad5..1177547 100644
--- a/net/bridge/br_arp_nd_proxy.c
+++ b/net/bridge/br_arp_nd_proxy.c
@@ -14,6 +14,13 @@
  */
 
 #include <linux/kernel.h>
+#include <linux/netdevice.h>
+#include <linux/etherdevice.h>
+#include <linux/neighbour.h>
+#include <net/arp.h>
+#include <linux/if_vlan.h>
+#include <linux/inetdevice.h>
+#include <net/addrconf.h>
 #include "br_private.h"
 
 void br_recalculate_neigh_suppress_enabled(struct net_bridge *br)
@@ -30,3 +37,180 @@ void br_recalculate_neigh_suppress_enabled(struct net_bridge *br)
 
 	br->neigh_suppress_enabled = neigh_suppress;
 }
+
+static void br_arp_send(struct net_bridge_port *p, int type, int ptype,
+			__be32 dest_ip, struct net_device *dev,
+			__be32 src_ip, const unsigned char *dest_hw,
+			const unsigned char *src_hw,
+			const unsigned char *target_hw,
+			__be16 vlan_proto, u16 vlan_tci)
+{
+	struct sk_buff *skb;
+
+	netdev_dbg(dev, "arp send dev %s dst %pI4 dst_hw %pM src %pI4 src_hw %pM\n",
+		   dev->name, &dest_ip, dest_hw, &src_ip, src_hw);
+
+	if (!vlan_tci) {
+		arp_send(type, ptype, dest_ip, dev, src_ip,
+			 dest_hw, src_hw, target_hw);
+		return;
+	}
+
+	skb = arp_create(type, ptype, dest_ip, dev, src_ip,
+			 dest_hw, src_hw, target_hw);
+	if (!skb)
+		return;
+
+	if (p) {
+		struct net_bridge_vlan_group *vg;
+		u16 pvid;
+
+		vg = nbp_vlan_group_rcu(p);
+		pvid = br_get_pvid(vg);
+		if (pvid == vlan_tci)
+			vlan_tci = 0;
+	}
+
+	if (vlan_tci) {
+		skb = vlan_insert_tag_set_proto(skb, vlan_proto,
+						vlan_tci);
+		if (!skb) {
+			net_err_ratelimited("%s: failed to insert VLAN tag\n",
+					    __func__);
+			return;
+		}
+	}
+
+	arp_xmit(skb);
+}
+
+static int br_chk_addr_ip(struct net_device *dev, void *data)
+{
+	__be32 ip = *(__be32 *)data;
+	struct in_device *in_dev;
+	__be32 addr = 0;
+
+	in_dev = __in_dev_get_rcu(dev);
+	if (in_dev)
+		addr = inet_confirm_addr(dev_net(dev), in_dev, 0, ip,
+					 RT_SCOPE_HOST);
+
+	if (addr == ip)
+		return 1;
+
+	return 0;
+}
+
+static bool br_is_local_ip(struct net_device *dev, __be32 ip)
+{
+	if (br_chk_addr_ip(dev, &ip))
+		return true;
+
+	/* check if ip is configured on upper dev */
+	if (netdev_walk_all_upper_dev_rcu(dev, br_chk_addr_ip, &ip))
+		return true;
+
+	return false;
+}
+
+void br_do_proxy_suppress_arp(struct sk_buff *skb, struct net_bridge *br,
+			      u16 vid, struct net_bridge_port *p)
+{
+	struct net_device *dev = br->dev;
+	struct net_device *vlandev = dev;
+	struct neighbour *n;
+	struct arphdr *parp;
+	u8 *arpptr, *sha;
+	__be32 sip, tip;
+
+	BR_INPUT_SKB_CB(skb)->proxyarp_replied = false;
+
+	if ((dev->flags & IFF_NOARP) ||
+	    !pskb_may_pull(skb, arp_hdr_len(dev)))
+		return;
+
+	parp = arp_hdr(skb);
+
+	if (parp->ar_pro != htons(ETH_P_IP) ||
+	    parp->ar_hln != dev->addr_len ||
+	    parp->ar_pln != 4)
+		return;
+
+	arpptr = (u8 *)parp + sizeof(struct arphdr);
+	sha = arpptr;
+	arpptr += dev->addr_len;	/* sha */
+	memcpy(&sip, arpptr, sizeof(sip));
+	arpptr += sizeof(sip);
+	arpptr += dev->addr_len;	/* tha */
+	memcpy(&tip, arpptr, sizeof(tip));
+
+	if (ipv4_is_loopback(tip) ||
+	    ipv4_is_multicast(tip))
+		return;
+
+	if (br->neigh_suppress_enabled) {
+		if (p && (p->flags & BR_NEIGH_SUPPRESS))
+			return;
+		if (ipv4_is_zeronet(sip) || sip == tip) {
+			/* prevent flooding to neigh suppress ports */
+			BR_INPUT_SKB_CB(skb)->proxyarp_replied = true;
+			return;
+		}
+	}
+
+	if (parp->ar_op != htons(ARPOP_REQUEST))
+		return;
+
+	if (vid != 0) {
+		vlandev = __vlan_find_dev_deep_rcu(br->dev, skb->vlan_proto,
+						   vid);
+		if (!vlandev)
+			return;
+	}
+
+	if (br->neigh_suppress_enabled && br_is_local_ip(vlandev, tip)) {
+		/* its our local ip, so don't proxy reply
+		 * and don't forward to neigh suppress ports
+		 */
+		BR_INPUT_SKB_CB(skb)->proxyarp_replied = true;
+		return;
+	}
+
+	n = neigh_lookup(&arp_tbl, &tip, vlandev);
+	if (n) {
+		struct net_bridge_fdb_entry *f;
+
+		if (!(n->nud_state & NUD_VALID)) {
+			neigh_release(n);
+			return;
+		}
+
+		f = br_fdb_find_rcu(br, n->ha, vid);
+		if (f) {
+			bool replied = false;
+
+			if ((p && (p->flags & BR_PROXYARP)) ||
+			    (f->dst && ((f->dst->flags & BR_PROXYARP_WIFI) ||
+				       (f->dst->flags & BR_NEIGH_SUPPRESS)))) {
+				if (!vid)
+					br_arp_send(p, ARPOP_REPLY, ETH_P_ARP,
+						    sip, skb->dev, tip, sha,
+						    n->ha, sha, 0, 0);
+				else
+					br_arp_send(p, ARPOP_REPLY, ETH_P_ARP,
+						    sip, skb->dev, tip, sha,
+						    n->ha, sha, skb->vlan_proto,
+						    skb_vlan_tag_get(skb));
+				replied = true;
+			}
+
+			/* If we have replied or as long as we know the
+			 * mac, indicate to arp replied
+			 */
+			if (replied || br->neigh_suppress_enabled)
+				BR_INPUT_SKB_CB(skb)->proxyarp_replied = true;
+		}
+
+		neigh_release(n);
+	}
+}
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index f6b6a92..53d1456 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -39,6 +39,7 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
 	struct pcpu_sw_netstats *brstats = this_cpu_ptr(br->stats);
 	const struct nf_br_ops *nf_ops;
 	const unsigned char *dest;
+	struct ethhdr *eth;
 	u16 vid = 0;
 
 	rcu_read_lock();
@@ -57,11 +58,19 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
 	BR_INPUT_SKB_CB(skb)->brdev = dev;
 
 	skb_reset_mac_header(skb);
+	eth = eth_hdr(skb);
 	skb_pull(skb, ETH_HLEN);
 
 	if (!br_allowed_ingress(br, br_vlan_group_rcu(br), skb, &vid))
 		goto out;
 
+	if (IS_ENABLED(CONFIG_INET) &&
+	    (eth->h_proto == htons(ETH_P_ARP) ||
+	     eth->h_proto == htons(ETH_P_RARP)) &&
+	    br->neigh_suppress_enabled) {
+		br_do_proxy_suppress_arp(skb, br, vid, NULL);
+	}
+
 	dest = eth_hdr(skb)->h_dest;
 	if (is_broadcast_ether_addr(dest)) {
 		br_flood(br, skb, BR_PKT_BROADCAST, false, true);
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 7cb6137..4b8d2ec 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -71,62 +71,6 @@ static int br_pass_frame_up(struct sk_buff *skb)
 		       br_netif_receive_skb);
 }
 
-static void br_do_proxy_arp(struct sk_buff *skb, struct net_bridge *br,
-			    u16 vid, struct net_bridge_port *p)
-{
-	struct net_device *dev = br->dev;
-	struct neighbour *n;
-	struct arphdr *parp;
-	u8 *arpptr, *sha;
-	__be32 sip, tip;
-
-	BR_INPUT_SKB_CB(skb)->proxyarp_replied = false;
-
-	if ((dev->flags & IFF_NOARP) ||
-	    !pskb_may_pull(skb, arp_hdr_len(dev)))
-		return;
-
-	parp = arp_hdr(skb);
-
-	if (parp->ar_pro != htons(ETH_P_IP) ||
-	    parp->ar_op != htons(ARPOP_REQUEST) ||
-	    parp->ar_hln != dev->addr_len ||
-	    parp->ar_pln != 4)
-		return;
-
-	arpptr = (u8 *)parp + sizeof(struct arphdr);
-	sha = arpptr;
-	arpptr += dev->addr_len;	/* sha */
-	memcpy(&sip, arpptr, sizeof(sip));
-	arpptr += sizeof(sip);
-	arpptr += dev->addr_len;	/* tha */
-	memcpy(&tip, arpptr, sizeof(tip));
-
-	if (ipv4_is_loopback(tip) ||
-	    ipv4_is_multicast(tip))
-		return;
-
-	n = neigh_lookup(&arp_tbl, &tip, dev);
-	if (n) {
-		struct net_bridge_fdb_entry *f;
-
-		if (!(n->nud_state & NUD_VALID)) {
-			neigh_release(n);
-			return;
-		}
-
-		f = br_fdb_find_rcu(br, n->ha, vid);
-		if (f && ((p->flags & BR_PROXYARP) ||
-			  (f->dst && (f->dst->flags & BR_PROXYARP_WIFI)))) {
-			arp_send(ARPOP_REPLY, ETH_P_ARP, sip, skb->dev, tip,
-				 sha, n->ha, sha);
-			BR_INPUT_SKB_CB(skb)->proxyarp_replied = true;
-		}
-
-		neigh_release(n);
-	}
-}
-
 /* note: already called with rcu_read_lock */
 int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
 {
@@ -171,8 +115,11 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
 
 	BR_INPUT_SKB_CB(skb)->brdev = br->dev;
 
-	if (IS_ENABLED(CONFIG_INET) && skb->protocol == htons(ETH_P_ARP))
-		br_do_proxy_arp(skb, br, vid, p);
+	if (IS_ENABLED(CONFIG_INET) &&
+	    (skb->protocol == htons(ETH_P_ARP) ||
+	     skb->protocol == htons(ETH_P_RARP))) {
+		br_do_proxy_suppress_arp(skb, br, vid, p);
+	}
 
 	switch (pkt_type) {
 	case BR_PKT_MULTICAST:
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index f47332e..bb095dc 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -1139,5 +1139,8 @@ static inline void br_switchdev_frame_unmark(struct sk_buff *skb)
 }
 #endif /* CONFIG_NET_SWITCHDEV */
 
+/* br_arp_nd_proxy.c */
 void br_recalculate_neigh_suppress_enabled(struct net_bridge *br);
+void br_do_proxy_suppress_arp(struct sk_buff *skb, struct net_bridge *br,
+			      u16 vid, struct net_bridge_port *p);
 #endif
-- 
2.1.4

^ permalink raw reply related

* [PATCH net-next v2 3/3] bridge: suppress nd pkts on BR_NEIGH_SUPPRESS ports
From: Roopa Prabhu @ 2017-10-03 18:21 UTC (permalink / raw)
  To: davem; +Cc: netdev, nikolay, stephen, bridge
In-Reply-To: <1507054876-16746-1-git-send-email-roopa@cumulusnetworks.com>

From: Roopa Prabhu <roopa@cumulusnetworks.com>

This patch avoids flooding and proxies ndisc packets
for BR_NEIGH_SUPPRESS ports.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
---
 net/bridge/br_arp_nd_proxy.c | 246 +++++++++++++++++++++++++++++++++++++++++++
 net/bridge/br_device.c       |  11 ++
 net/bridge/br_input.c        |  10 ++
 net/bridge/br_private.h      |   3 +
 4 files changed, 270 insertions(+)

diff --git a/net/bridge/br_arp_nd_proxy.c b/net/bridge/br_arp_nd_proxy.c
index 1177547..2ae8ea3 100644
--- a/net/bridge/br_arp_nd_proxy.c
+++ b/net/bridge/br_arp_nd_proxy.c
@@ -214,3 +214,249 @@ void br_do_proxy_suppress_arp(struct sk_buff *skb, struct net_bridge *br,
 		neigh_release(n);
 	}
 }
+
+#if IS_ENABLED(CONFIG_IPV6)
+struct nd_msg *br_is_nd_neigh_msg(struct sk_buff *skb, struct nd_msg *msg)
+{
+	struct nd_msg *m;
+
+	m = skb_header_pointer(skb, skb_network_offset(skb) +
+			       sizeof(struct ipv6hdr), sizeof(*msg), msg);
+	if (!m)
+		return NULL;
+
+	if (m->icmph.icmp6_code != 0 ||
+	    (m->icmph.icmp6_type != NDISC_NEIGHBOUR_SOLICITATION &&
+	     m->icmph.icmp6_type != NDISC_NEIGHBOUR_ADVERTISEMENT))
+		return NULL;
+
+	return m;
+}
+
+static void br_nd_send(struct net_bridge_port *p, struct sk_buff *request,
+		       struct neighbour *n, __be16 vlan_proto, u16 vlan_tci,
+		       struct nd_msg *ns)
+{
+	struct net_device *dev = request->dev;
+	struct sk_buff *reply;
+	struct nd_msg *na;
+	struct ipv6hdr *pip6;
+	u8 *daddr;
+	int na_olen = 8; /* opt hdr + ETH_ALEN for target */
+	int ns_olen;
+	int i, len;
+
+	if (!dev)
+		return;
+
+	len = LL_RESERVED_SPACE(dev) + sizeof(struct ipv6hdr) +
+		sizeof(*na) + na_olen + dev->needed_tailroom;
+
+	reply = alloc_skb(len, GFP_ATOMIC);
+	if (!reply)
+		return;
+
+	reply->protocol = htons(ETH_P_IPV6);
+	reply->dev = dev;
+	skb_reserve(reply, LL_RESERVED_SPACE(dev));
+	skb_push(reply, sizeof(struct ethhdr));
+	skb_set_mac_header(reply, 0);
+
+	daddr = eth_hdr(request)->h_source;
+
+	/* Do we need option processing ? */
+	ns_olen = request->len - (skb_network_offset(request) +
+				  sizeof(struct ipv6hdr)) - sizeof(*ns);
+	for (i = 0; i < ns_olen - 1; i += (ns->opt[i + 1] << 3)) {
+		if (ns->opt[i] == ND_OPT_SOURCE_LL_ADDR) {
+			daddr = ns->opt + i + sizeof(struct nd_opt_hdr);
+			break;
+		}
+	}
+
+	/* Ethernet header */
+	ether_addr_copy(eth_hdr(reply)->h_dest, daddr);
+	ether_addr_copy(eth_hdr(reply)->h_source, n->ha);
+	eth_hdr(reply)->h_proto = htons(ETH_P_IPV6);
+	reply->protocol = htons(ETH_P_IPV6);
+
+	skb_pull(reply, sizeof(struct ethhdr));
+	skb_set_network_header(reply, 0);
+	skb_put(reply, sizeof(struct ipv6hdr));
+
+	/* IPv6 header */
+	pip6 = ipv6_hdr(reply);
+	memset(pip6, 0, sizeof(struct ipv6hdr));
+	pip6->version = 6;
+	pip6->priority = ipv6_hdr(request)->priority;
+	pip6->nexthdr = IPPROTO_ICMPV6;
+	pip6->hop_limit = 255;
+	pip6->daddr = ipv6_hdr(request)->saddr;
+	pip6->saddr = *(struct in6_addr *)n->primary_key;
+
+	skb_pull(reply, sizeof(struct ipv6hdr));
+	skb_set_transport_header(reply, 0);
+
+	na = (struct nd_msg *)skb_put(reply, sizeof(*na) + na_olen);
+
+	/* Neighbor Advertisement */
+	memset(na, 0, sizeof(*na) + na_olen);
+	na->icmph.icmp6_type = NDISC_NEIGHBOUR_ADVERTISEMENT;
+	na->icmph.icmp6_router = 0; /* XXX: should be 1 ? */
+	na->icmph.icmp6_override = 1;
+	na->icmph.icmp6_solicited = 1;
+	na->target = ns->target;
+	ether_addr_copy(&na->opt[2], n->ha);
+	na->opt[0] = ND_OPT_TARGET_LL_ADDR;
+	na->opt[1] = na_olen >> 3;
+
+	na->icmph.icmp6_cksum = csum_ipv6_magic(&pip6->saddr,
+						&pip6->daddr,
+						sizeof(*na) + na_olen,
+						IPPROTO_ICMPV6,
+						csum_partial(na, sizeof(*na) + na_olen, 0));
+
+	pip6->payload_len = htons(sizeof(*na) + na_olen);
+
+	skb_push(reply, sizeof(struct ipv6hdr));
+	skb_push(reply, sizeof(struct ethhdr));
+
+	reply->ip_summed = CHECKSUM_UNNECESSARY;
+
+	if (p) {
+		struct net_bridge_vlan_group *vg;
+		u16 pvid;
+
+		vg = nbp_vlan_group_rcu(p);
+		pvid = br_get_pvid(vg);
+		if (pvid && pvid == vlan_tci)
+			vlan_tci = 0;
+	}
+
+	if (vlan_tci != 0) {
+		reply = vlan_insert_tag_set_proto(reply, vlan_proto, vlan_tci);
+		if (!reply) {
+			net_err_ratelimited("evpn: failed to insert VLAN tag\n");
+			return;
+		}
+	}
+
+	netdev_dbg(dev, "nd send dev %s dst %pI6 dst_hw %pM src %pI6 src_hw %pM\n",
+		   dev->name, &pip6->daddr, daddr, &pip6->saddr, n->ha);
+
+	dev_queue_xmit(reply);
+}
+
+static int br_chk_addr_ip6(struct net_device *dev, void *data)
+{
+	struct in6_addr *addr = (struct in6_addr *)data;
+
+	if (ipv6_chk_addr(dev_net(dev), addr, dev, 0))
+		return 1;
+
+	return 0;
+}
+
+static bool br_is_local_ip6(struct net_device *dev, struct in6_addr *addr)
+
+{
+	if (br_chk_addr_ip6(dev, addr))
+		return true;
+
+	/* check if ip is configured on upper dev */
+	if (netdev_walk_all_upper_dev_rcu(dev, br_chk_addr_ip6, addr))
+		return true;
+
+	return false;
+}
+
+void br_do_suppress_nd(struct sk_buff *skb, struct net_bridge *br,
+		       u16 vid, struct net_bridge_port *p, struct nd_msg *msg)
+{
+	struct net_device *dev = br->dev;
+	struct net_device *vlandev = NULL;
+	struct in6_addr *saddr, *daddr;
+	struct ipv6hdr *iphdr;
+	struct inet6_dev *in6_dev;
+	struct neighbour *n;
+
+	BR_INPUT_SKB_CB(skb)->proxyarp_replied = false;
+
+	if (p && (p->flags & BR_NEIGH_SUPPRESS))
+		return;
+
+	if (msg->icmph.icmp6_type == NDISC_NEIGHBOUR_ADVERTISEMENT &&
+	    !msg->icmph.icmp6_solicited) {
+		/* prevent flooding to neigh suppress ports */
+		BR_INPUT_SKB_CB(skb)->proxyarp_replied = true;
+		return;
+	}
+
+	if (msg->icmph.icmp6_type != NDISC_NEIGHBOUR_SOLICITATION)
+		return;
+
+	in6_dev = __in6_dev_get(dev);
+	if (!in6_dev)
+		return;
+
+	iphdr = ipv6_hdr(skb);
+	saddr = &iphdr->saddr;
+	daddr = &iphdr->daddr;
+
+	if (ipv6_addr_any(saddr) || !ipv6_addr_cmp(saddr, daddr)) {
+		/* prevent flooding to neigh suppress ports */
+		BR_INPUT_SKB_CB(skb)->proxyarp_replied = true;
+		return;
+	}
+
+	if (vid != 0) {
+		/* build neigh table lookup on the vlan device */
+		vlandev = __vlan_find_dev_deep_rcu(br->dev, skb->vlan_proto,
+						   vid);
+		if (!vlandev)
+			return;
+	} else {
+		vlandev = dev;
+	}
+
+	if (br_is_local_ip6(vlandev, &msg->target)) {
+		/* its our own ip, so don't proxy reply
+		 * and don't forward to arp suppress ports
+		 */
+		BR_INPUT_SKB_CB(skb)->proxyarp_replied = true;
+		return;
+	}
+
+	n = neigh_lookup(ipv6_stub->nd_tbl, &msg->target, vlandev);
+	if (n) {
+		struct net_bridge_fdb_entry *f;
+
+		if (!(n->nud_state & NUD_VALID)) {
+			neigh_release(n);
+			return;
+		}
+
+		f = br_fdb_find_rcu(br, n->ha, vid);
+		if (f) {
+			bool replied = false;
+
+			if (f->dst && (f->dst->flags & BR_NEIGH_SUPPRESS)) {
+				if (vid != 0)
+					br_nd_send(p, skb, n, skb->vlan_proto,
+						   skb_vlan_tag_get(skb), msg);
+				else
+					br_nd_send(p, skb, n, 0, 0, msg);
+				replied = true;
+			}
+
+			/* If we have replied or as long as we know the
+			 * mac, indicate to NEIGH_SUPPRESS ports that we
+			 * have replied
+			 */
+			if (replied || br->neigh_suppress_enabled)
+				BR_INPUT_SKB_CB(skb)->proxyarp_replied = true;
+		}
+		neigh_release(n);
+	}
+}
+#endif
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 53d1456..c85345a 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -69,6 +69,17 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
 	     eth->h_proto == htons(ETH_P_RARP)) &&
 	    br->neigh_suppress_enabled) {
 		br_do_proxy_suppress_arp(skb, br, vid, NULL);
+	} else if (IS_ENABLED(CONFIG_IPV6) &&
+		   skb->protocol == htons(ETH_P_IPV6) &&
+		   br->neigh_suppress_enabled &&
+		   pskb_may_pull(skb, sizeof(struct ipv6hdr) +
+				 sizeof(struct nd_msg)) &&
+		   ipv6_hdr(skb)->nexthdr == IPPROTO_ICMPV6) {
+			struct nd_msg *msg, _msg;
+
+			msg = br_is_nd_neigh_msg(skb, &_msg);
+			if (msg)
+				br_do_suppress_nd(skb, br, vid, NULL, msg);
 	}
 
 	dest = eth_hdr(skb)->h_dest;
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 4b8d2ec..013b65f 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -119,6 +119,16 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
 	    (skb->protocol == htons(ETH_P_ARP) ||
 	     skb->protocol == htons(ETH_P_RARP))) {
 		br_do_proxy_suppress_arp(skb, br, vid, p);
+	} else if (IS_ENABLED(CONFIG_IPV6) && br->neigh_suppress_enabled &&
+		   skb->protocol == htons(ETH_P_IPV6) &&
+		   pskb_may_pull(skb, sizeof(struct ipv6hdr) +
+				 sizeof(struct nd_msg)) &&
+		   ipv6_hdr(skb)->nexthdr == IPPROTO_ICMPV6) {
+			struct nd_msg *msg, _msg;
+
+			msg = br_is_nd_neigh_msg(skb, &_msg);
+			if (msg)
+				br_do_suppress_nd(skb, br, vid, p, msg);
 	}
 
 	switch (pkt_type) {
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index bb095dc..f6936e9 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -1143,4 +1143,7 @@ static inline void br_switchdev_frame_unmark(struct sk_buff *skb)
 void br_recalculate_neigh_suppress_enabled(struct net_bridge *br);
 void br_do_proxy_suppress_arp(struct sk_buff *skb, struct net_bridge *br,
 			      u16 vid, struct net_bridge_port *p);
+void br_do_suppress_nd(struct sk_buff *skb, struct net_bridge *br,
+		       u16 vid, struct net_bridge_port *p, struct nd_msg *msg);
+struct nd_msg *br_is_nd_neigh_msg(struct sk_buff *skb, struct nd_msg *m);
 #endif
-- 
2.1.4

^ permalink raw reply related

* Re: [PATCH iproute2 2/3] include: add <linux/vm_sockets_diag.h>
From: Stephen Hemminger @ 2017-10-03 18:24 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: netdev, Jorgen Hansen, Dexuan Cui
In-Reply-To: <20171003175744.24987-3-stefanha@redhat.com>

On Tue,  3 Oct 2017 13:57:43 -0400
Stefan Hajnoczi <stefanha@redhat.com> wrote:

> This new Linux header file defines the sock_diag interface used by
> AF_VSOCK.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  include/linux/vm_sockets_diag.h | 33 +++++++++++++++++++++++++++++++++
>  1 file changed, 33 insertions(+)
>  create mode 100644 include/linux/vm_sockets_diag.h
> 

All header files in linux must be headers that come from 'make install_headers'
in kernel. I don't see vm_sockets_diag.h even in net-next.

^ permalink raw reply

* Re: [PATCH iproute2 1/3] ss: allow AF_FAMILY constants >32
From: Stephen Hemminger @ 2017-10-03 18:26 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: netdev, Jorgen Hansen, Dexuan Cui
In-Reply-To: <20171003175744.24987-2-stefanha@redhat.com>

On Tue,  3 Oct 2017 13:57:42 -0400
Stefan Hajnoczi <stefanha@redhat.com> wrote:

> Linux has more than 32 address families defined in <bits/socket.h>.  Use
> a 64-bit type so all of them can be represented in the filter->families
> bitmask.
> 
> It's easy to introduce bugs when using (1 << AF_FAMILY) because the
> value is 32-bit.  This can produce incorrect results from bitmask
> operations so introduce the FAMILY_MASK() macro to eliminate these bugs.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  misc/ss.c | 54 ++++++++++++++++++++++++++++--------------------------
>  1 file changed, 28 insertions(+), 26 deletions(-)
> 
> diff --git a/misc/ss.c b/misc/ss.c
> index dd8dfaa4..12a31c90 100644
> --- a/misc/ss.c
> +++ b/misc/ss.c
> @@ -170,55 +170,57 @@ enum {
>  struct filter {
>  	int dbs;
>  	int states;
> -	int families;
> +	__u64 families;

Since this isn't a value that is coming from kernel. It should be uint64_t
rather than __u64.

^ permalink raw reply

* Re: [PATCH net-next v2 1/3] bridge: add new BR_NEIGH_SUPPRESS port flag to suppress arp and nd flood
From: Stephen Hemminger @ 2017-10-03 18:29 UTC (permalink / raw)
  To: Roopa Prabhu; +Cc: davem, netdev, nikolay, bridge
In-Reply-To: <1507054876-16746-2-git-send-email-roopa@cumulusnetworks.com>

On Tue,  3 Oct 2017 11:21:14 -0700
Roopa Prabhu <roopa@cumulusnetworks.com> wrote:

> diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
> index 48fb174..7a50dc5 100644
> --- a/net/bridge/br_forward.c
> +++ b/net/bridge/br_forward.c
> @@ -204,7 +204,8 @@ void br_flood(struct net_bridge *br, struct sk_buff *skb,
>  		/* Do not flood to ports that enable proxy ARP */
>  		if (p->flags & BR_PROXYARP)
>  			continue;
> -		if ((p->flags & BR_PROXYARP_WIFI) &&
> +		if ((p->flags & BR_PROXYARP_WIFI ||
> +		     p->flags & BR_NEIGH_SUPPRESS) &&
>  		    BR_INPUT_SKB_CB(skb)->proxyarp_replied)
>  			continue;

Don;t you need additional paren here to avoid warnings.
Or do one mask:
		if ((p->flags & (BR_PROXYARP_WIFI | BR_NEIGH_SUPPRESS)) &&
 		    BR_INPUT_SKB_CB(skb)->proxyarp_replied)
  			continue;

^ permalink raw reply

* Re: [PATCH] net: phy: DP83822 initial driver submission
From: Florian Fainelli @ 2017-10-03 18:31 UTC (permalink / raw)
  To: Dan Murphy, andrew; +Cc: netdev
In-Reply-To: <ccab5880-eace-503c-d325-d20867b98bd5@ti.com>

On 10/03/2017 11:03 AM, Dan Murphy wrote:
> Florian
> 
> Thanks for the review
> 
> On 10/03/2017 12:15 PM, Florian Fainelli wrote:
>>> +		} else {
>>> +			value &= ~DP83822_WOL_SECURE_ON;
>>> +		}
>>> +
>>> +		value |= (DP83822_WOL_EN | DP83822_WOL_CLR_INDICATION |
>>> +			  DP83822_WOL_CLR_INDICATION);
>>
>> The extra parenthesis should not be required here.
> 
> I did not code that in.  I had to add it after Checkpatch cribbed about it.
> Let me know if you want me to remove it.

Let's keep those, that does not change much.

> 
>>
>>> +		phy_write_mmd(phydev, DP83822_DEVADDR, MII_DP83822_WOL_CFG,
>>> +			      value);
>>> +	} else {
>>> +		value =
>>> +		    phy_read_mmd(phydev, DP83822_DEVADDR, MII_DP83822_WOL_CFG);
>>> +		value &= (~DP83822_WOL_EN);
>>
>> Same here, parenthesis should not be needed.
> 
> There are three lines of code in the else.  This code all needs to be excuted in the else case.
> I might reformat it to read better.  Lindent messed that one up.

sorry, I meant to write that you don't need the parenthesis around
DP83822_WOL_EN since that is just a single bit here.

[snip]

>>> +
>>> +	mutex_unlock(&phydev->lock);
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int dp83822_resume(struct phy_device *phydev)
>>> +{
>>> +	int value;
>>> +
>>> +	mutex_lock(&phydev->lock);
>>> +
>>> +	value = phy_read(phydev, MII_BMCR);
>>> +	phy_write(phydev, MII_BMCR, value & ~BMCR_PDOWN);
>>
>> And genphy_resume() here as well?
> 
> genphy_resume does not have WoL.

I should have been cleared, I meant using genphy_{suspend,resume} to
avoid open coding the setting of the BMCR_PDOWN bit, conversely clearing
of that bit. Because of the locking, maybe you could introduce unlocked
versions of these two routines, or you acquire and release the lock
outside of genphy_{suspend,resume}?

> 
>>
>>> +
>>> +	value = phy_read_mmd(phydev, DP83822_DEVADDR, MII_DP83822_WOL_CFG);
>>> +
>>> +	phy_write_mmd(phydev, DP83822_DEVADDR, MII_DP83822_WOL_CFG, value |
>>> +		      DP83822_WOL_CLR_INDICATION);
>>> +
>>> +	mutex_unlock(&phydev->lock);
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static struct phy_driver dp83822_driver[] = {
>>> +	{
>>> +	 .phy_id = DP83822_PHY_ID,
>>> +	 .phy_id_mask = 0xfffffff0,
>>> +	 .name = "TI DP83822",
>>> +	 .features = PHY_BASIC_FEATURES,
>>> +	 .flags = PHY_HAS_INTERRUPT,
>>> +
>>> +	 .config_init = genphy_config_init,
>>> +	 .soft_reset = dp83822_phy_reset,
>>> +
>>> +	 .get_wol = dp83822_get_wol,
>>> +	 .set_wol = dp83822_set_wol,
>>> +
>>> +	 /* IRQ related */
>>> +	 .ack_interrupt = dp83822_ack_interrupt,
>>> +	 .config_intr = dp83822_config_intr,
>>> +
>>> +	 .config_aneg = genphy_config_aneg,
>>> +	 .read_status = genphy_read_status,
>>> +	 .suspend = dp83822_suspend,
>>> +	 .resume = dp83822_resume,
>>> +	 },
>>
>> I would omit newlines between definitions of callbacks, but this is
>> really a personal preference. Unless you are planning on adding new IDs,
>> you could also avoid using an array of 1 element and just a plain
>> phy_driver structure, but that's not a big deal either.
> 
> Yes there is a plan to add another phy id in early 2018 to this driver.

Alright then!
-- 
Florian

^ permalink raw reply

* Re: [PATCH v4 net-next 0/8] flow_dissector: Protocol specific flow dissector offload
From: Tom Herbert @ 2017-10-03 18:35 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Tom Herbert, David Miller, Hannes Frederic Sowa,
	Linux Kernel Network Developers, Rohit Seth
In-Reply-To: <20171003074632.GD1916@nanopsycho>

On Tue, Oct 3, 2017 at 12:46 AM, Jiri Pirko <jiri@resnulli.us> wrote:
> Fri, Sep 29, 2017 at 07:59:35PM CEST, tom@herbertland.com wrote:
>>On Fri, Sep 29, 2017 at 10:42 AM, David Miller <davem@davemloft.net> wrote:
>>> From: Tom Herbert <tom@herbertland.com>
>>> Date: Fri, 29 Sep 2017 08:48:55 -0700
>>>
>>>> The flow_dissector interface is not a uAPI.
>>>
>>> That's not true, insofar as cls_flower.c uses the flow_dissector
>>> therefore if you change the flow_dissector in certain ways then
>>> cls_flower.c might have it's behavior changed and that is in fact UAPI
>>> facing.
>>
>>Then I would suggest adding another flag like FLOW_DISSECTOR_F_FLOWER
>>and when anyone puts new code into flow_dissector they can wrap it
>>with "if !(flags & FLOW_DISSECTOR_F_FLOWER)". If the flower uAPI is
>>subsequently update then the conditional can be removed. This way
>>flower can support maintain its APIs, but we can still still extend
>>and improve flow_dissector for othersuse cases.
>
> This is not flower-specific problem. Flow_dissector is a servant of many.

Besides flower, what other use cases of flow_dissector have made
flow_dissector interface a uAPI? Any use of hashing does not do this.
Maybe OVS does?

> As such, it is instructed what should it do. If you want to
> change the way inner headers are parsed, you should either:

Why would that only affect the way inner headers are parsed? Wouldn't
we need to consider any change to flow_dissector that might affect the
output in any way. For instance, the depth limits I added would change
to output for someone that was parsing thirty-five layers of
encapsulation so it it looks like that feature needs a flag. What if
someone adds a new Ethernet protocol or a new encap protocol?

> 1) change the callers so they are behaving the same as before
> 2) make the flow_dissection change optional so the caller can say if he
>    wants original or new behaviour.

I guess we can do that, but am concerned about the overhead this will
generate if were adding a flag each time anyone modifies the function.
There are performance critical use cases of flow_dissector that will
be impacted by such changes.

Tom


>

^ permalink raw reply

* Re: [RFC 1/2] bpf: move instruction printing into a separate file
From: Daniel Borkmann @ 2017-10-03 19:32 UTC (permalink / raw)
  To: Jakub Kicinski, dsahern, alexei.starovoitov
  Cc: netdev, oss-drivers, david.beckett
In-Reply-To: <20171003175746.30145-1-jakub.kicinski@netronome.com>

On 10/03/2017 07:57 PM, Jakub Kicinski wrote:
> Separate the instruction printing into a standalone source file.
> This way sneaky code from tools/ can use it directly.
>
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> ---
> Like this?

Looks good to me, yes.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox