Netdev List
* Re: [PATCH net-next v2 3/8] sctp: move the flush of ctrl chunks into its own function
From: Marcelo Ricardo Leitner @ 2018-05-14  0:21 UTC (permalink / raw)
  To: netdev; +Cc: linux-sctp, Neil Horman, Vlad Yasevich, Xin Long
In-Reply-To: <629f02a784ff90761adee451a870c7548069287d.1526142784.git.marcelo.leitner@gmail.com>

On Sat, May 12, 2018 at 07:21:01PM -0300, Marcelo Ricardo Leitner wrote:
> Named sctp_outq_flush_ctrl and, with that, keep the contexts contained.

Trinity triggered a panic on this patch. I'm debugging it.

* Re: [PATCH net] xfrm6: avoid potential infinite loop in _decode_session6()
From: David Miller @ 2018-05-14  0:23 UTC (permalink / raw)
  To: edumazet; +Cc: netdev, eric.dumazet, steffen.klassert, nicolas.dichtel
In-Reply-To: <20180512094930.77801-1-edumazet@google.com>

From: Eric Dumazet <edumazet@google.com>
Date: Sat, 12 May 2018 02:49:30 -0700

> syzbot found a way to trigger an infinite loop by overflowing
> @offset variable that has been forced to use u16 for some very
> obscure reason in the past.
> 
> We probably want to look at NEXTHDR_FRAGMENT handling which looks
> wrong, in a separate patch.
> 
> In net-next, we shall try to use skb_header_pointer() instead of
> pskb_may_pull().
 ...
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Steffen Klassert <steffen.klassert@secunet.com>
> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> Reported-by: syzbot+0053c8...@syzkaller.appspotmail.com

Steffen, I am assuming you will pick this up.

Thank you.

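For readers unfamiliar with the failure mode Eric describes, here is a
minimal user-space sketch of how a 16-bit offset can wrap. It is not the
_decode_session6() code: next_offset_u16() and next_offset_int() are
hypothetical helpers, and the int variant is shown only for contrast,
not as the actual fix.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: advance a 16-bit offset past one header. */
static uint16_t next_offset_u16(uint16_t offset, unsigned int hdr_len)
{
	assert(hdr_len <= UINT16_MAX);
	/* The sum is truncated to 16 bits, i.e. it wraps modulo 65536. */
	return (uint16_t)(offset + hdr_len);
}

/* Hypothetical helper: the same step with a wider offset type. */
static int next_offset_int(int offset, unsigned int hdr_len)
{
	assert(hdr_len <= UINT16_MAX);
	return offset + (int)hdr_len;
}
```

With offset 65528 and a 16-byte header the u16 variant yields 8, so a
walker that keeps adding header lengths can revisit earlier offsets
instead of running off the end of the packet.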

* Re: [PATCH] 3c59x: convert to generic DMA API
From: David Miller @ 2018-05-14  0:23 UTC (permalink / raw)
  To: hch; +Cc: netdev, linux-pci, linux-kernel, tedheadster
In-Reply-To: <20180512101650.1693-1-hch@lst.de>

From: Christoph Hellwig <hch@lst.de>
Date: Sat, 12 May 2018 12:16:50 +0200

> This driver supports EISA devices in addition to PCI devices, and relied
> on the legacy behavior of the pci_dma* shims to pass on a NULL pointer
> to the DMA API, and the DMA API being able to handle that.  When the
> NULL forwarding broke the EISA support got broken.  Fix this by converting
> to the DMA API instead of the legacy PCI shims.
> 
> Fixes: 4167b2ad ("PCI: Remove NULL device handling from PCI DMA API")
> Reported-by: tedheadster <tedheadster@gmail.com>
> Tested-by: tedheadster <tedheadster@gmail.com>
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Applied and queued up for -stable, thanks.

* [PATCH bpf-next 0/6] Minor follow-up cleanups in BPF JITs and optimized imm emission
From: Daniel Borkmann @ 2018-05-14  0:26 UTC (permalink / raw)
  To: alexei.starovoitov; +Cc: netdev, Daniel Borkmann

This series follows up mostly with some minor cleanups on top
of 'Move ld_abs/ld_ind to native BPF' as well as implements better
32/64 bit immediate load into register for the arm64 JIT. For details
please see individual patches. Thanks!

Daniel Borkmann (6):
  bpf, mips: remove unused function
  bpf, sparc: remove unused variable
  bpf, x64: clean up retpoline emission slightly
  bpf, arm32: save 4 bytes of unneeded stack space
  bpf, arm64: save 4 bytes of unneeded stack space
  bpf, arm64: optimize 32/64 immediate emission

 arch/arm/net/bpf_jit_32.c            | 13 ++---
 arch/arm64/net/bpf_jit_comp.c        | 93 ++++++++++++++++++++++--------------
 arch/mips/net/ebpf_jit.c             | 26 ----------
 arch/sparc/net/bpf_jit_comp_64.c     |  1 -
 arch/x86/include/asm/nospec-branch.h | 29 ++++++-----
 5 files changed, 74 insertions(+), 88 deletions(-)

-- 
2.9.5

* [PATCH bpf-next 1/6] bpf, mips: remove unused function
From: Daniel Borkmann @ 2018-05-14  0:26 UTC (permalink / raw)
  To: alexei.starovoitov; +Cc: netdev, Daniel Borkmann
In-Reply-To: <20180514002612.12083-1-daniel@iogearbox.net>

The ool_skb_header_pointer() and size_to_len() helpers are unused, as
is tmp_offset, therefore remove all of them.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 arch/mips/net/ebpf_jit.c | 26 --------------------------
 1 file changed, 26 deletions(-)

diff --git a/arch/mips/net/ebpf_jit.c b/arch/mips/net/ebpf_jit.c
index 7ba7df9..aeb7b1b 100644
--- a/arch/mips/net/ebpf_jit.c
+++ b/arch/mips/net/ebpf_jit.c
@@ -95,7 +95,6 @@ enum reg_val_type {
  * struct jit_ctx - JIT context
  * @skf:		The sk_filter
  * @stack_size:		eBPF stack size
- * @tmp_offset:		eBPF $sp offset to 8-byte temporary memory
  * @idx:		Instruction index
  * @flags:		JIT flags
  * @offsets:		Instruction offsets
@@ -105,7 +104,6 @@ enum reg_val_type {
 struct jit_ctx {
 	const struct bpf_prog *skf;
 	int stack_size;
-	int tmp_offset;
 	u32 idx;
 	u32 flags;
 	u32 *offsets;
@@ -293,7 +291,6 @@ static int gen_int_prologue(struct jit_ctx *ctx)
 	locals_size = (ctx->flags & EBPF_SEEN_FP) ? MAX_BPF_STACK : 0;
 
 	stack_adjust += locals_size;
-	ctx->tmp_offset = locals_size;
 
 	ctx->stack_size = stack_adjust;
 
@@ -399,7 +396,6 @@ static void gen_imm_to_reg(const struct bpf_insn *insn, int reg,
 		emit_instr(ctx, lui, reg, upper >> 16);
 		emit_instr(ctx, addiu, reg, reg, lower);
 	}
-
 }
 
 static int gen_imm_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
@@ -547,28 +543,6 @@ static int gen_imm_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
 	return 0;
 }
 
-static void * __must_check
-ool_skb_header_pointer(const struct sk_buff *skb, int offset,
-		       int len, void *buffer)
-{
-	return skb_header_pointer(skb, offset, len, buffer);
-}
-
-static int size_to_len(const struct bpf_insn *insn)
-{
-	switch (BPF_SIZE(insn->code)) {
-	case BPF_B:
-		return 1;
-	case BPF_H:
-		return 2;
-	case BPF_W:
-		return 4;
-	case BPF_DW:
-		return 8;
-	}
-	return 0;
-}
-
 static void emit_const_to_reg(struct jit_ctx *ctx, int dst, u64 value)
 {
 	if (value >= 0xffffffffffff8000ull || value < 0x8000ull) {
-- 
2.9.5

* [PATCH bpf-next 2/6] bpf, sparc: remove unused variable
From: Daniel Borkmann @ 2018-05-14  0:26 UTC (permalink / raw)
  To: alexei.starovoitov; +Cc: netdev, Daniel Borkmann
In-Reply-To: <20180514002612.12083-1-daniel@iogearbox.net>

Since fe83963b7c38 ("bpf, sparc64: remove ld_abs/ld_ind") the func
variable in build_insn() is not used anymore, therefore remove it.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 arch/sparc/net/bpf_jit_comp_64.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/sparc/net/bpf_jit_comp_64.c b/arch/sparc/net/bpf_jit_comp_64.c
index 9f5918e..222785a 100644
--- a/arch/sparc/net/bpf_jit_comp_64.c
+++ b/arch/sparc/net/bpf_jit_comp_64.c
@@ -894,7 +894,6 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
 	const int i = insn - ctx->prog->insnsi;
 	const s16 off = insn->off;
 	const s32 imm = insn->imm;
-	u32 *func;
 
 	if (insn->src_reg == BPF_REG_FP)
 		ctx->saw_frame_pointer = true;
-- 
2.9.5

* [PATCH bpf-next 3/6] bpf, x64: clean up retpoline emission slightly
From: Daniel Borkmann @ 2018-05-14  0:26 UTC (permalink / raw)
  To: alexei.starovoitov; +Cc: netdev, Daniel Borkmann
In-Reply-To: <20180514002612.12083-1-daniel@iogearbox.net>

Make the RETPOLINE_{RA,ED}X_BPF_JIT() a bit more readable by
cleaning up the macro, aligning comments and spacing.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 arch/x86/include/asm/nospec-branch.h | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 2cd344d..2f700a1d 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -301,9 +301,9 @@ do {									\
  *    jmp *%edx for x86_32
  */
 #ifdef CONFIG_RETPOLINE
-#ifdef CONFIG_X86_64
-# define RETPOLINE_RAX_BPF_JIT_SIZE	17
-# define RETPOLINE_RAX_BPF_JIT()				\
+# ifdef CONFIG_X86_64
+#  define RETPOLINE_RAX_BPF_JIT_SIZE	17
+#  define RETPOLINE_RAX_BPF_JIT()				\
 do {								\
 	EMIT1_off32(0xE8, 7);	 /* callq do_rop */		\
 	/* spec_trap: */					\
@@ -314,8 +314,8 @@ do {								\
 	EMIT4(0x48, 0x89, 0x04, 0x24); /* mov %rax,(%rsp) */	\
 	EMIT1(0xC3);             /* retq */			\
 } while (0)
-#else
-# define RETPOLINE_EDX_BPF_JIT()				\
+# else /* !CONFIG_X86_64 */
+#  define RETPOLINE_EDX_BPF_JIT()				\
 do {								\
 	EMIT1_off32(0xE8, 7);	 /* call do_rop */		\
 	/* spec_trap: */					\
@@ -326,17 +326,16 @@ do {								\
 	EMIT3(0x89, 0x14, 0x24); /* mov %edx,(%esp) */		\
 	EMIT1(0xC3);             /* ret */			\
 } while (0)
-#endif
+# endif
 #else /* !CONFIG_RETPOLINE */
-
-#ifdef CONFIG_X86_64
-# define RETPOLINE_RAX_BPF_JIT_SIZE	2
-# define RETPOLINE_RAX_BPF_JIT()				\
-	EMIT2(0xFF, 0xE0);	 /* jmp *%rax */
-#else
-# define RETPOLINE_EDX_BPF_JIT()				\
-	EMIT2(0xFF, 0xE2) /* jmp *%edx */
-#endif
+# ifdef CONFIG_X86_64
+#  define RETPOLINE_RAX_BPF_JIT_SIZE	2
+#  define RETPOLINE_RAX_BPF_JIT()				\
+	EMIT2(0xFF, 0xE0);       /* jmp *%rax */
+# else /* !CONFIG_X86_64 */
+#  define RETPOLINE_EDX_BPF_JIT()				\
+	EMIT2(0xFF, 0xE2)        /* jmp *%edx */
+# endif
 #endif
 
 #endif /* _ASM_X86_NOSPEC_BRANCH_H_ */
-- 
2.9.5

* [PATCH bpf-next 6/6] bpf, arm64: optimize 32/64 immediate emission
From: Daniel Borkmann @ 2018-05-14  0:26 UTC (permalink / raw)
  To: alexei.starovoitov; +Cc: netdev, Daniel Borkmann
In-Reply-To: <20180514002612.12083-1-daniel@iogearbox.net>

Improve the way the JIT emits 64 and 32 bit immediates. The current
algorithm is not optimal and we often emit more instructions
than actually needed. arm64 has movz, movn, movk variants but
for the current 64 bit immediates we only use movz with a
series of movk when needed.

For example loading ffffffffffffabab emits the following 4
instructions in the JIT today:

  * movz: abab, shift:  0, result: 000000000000abab
  * movk: ffff, shift: 16, result: 00000000ffffabab
  * movk: ffff, shift: 32, result: 0000ffffffffabab
  * movk: ffff, shift: 48, result: ffffffffffffabab

Whereas after the patch the same load only needs a single
instruction:

  * movn: 5454, shift:  0, result: ffffffffffffabab

Another example where two extra instructions can be saved:

  * movz: abab, shift:  0, result: 000000000000abab
  * movk: 1f2f, shift: 16, result: 000000001f2fabab
  * movk: ffff, shift: 32, result: 0000ffff1f2fabab
  * movk: ffff, shift: 48, result: ffffffff1f2fabab

After the patch:

  * movn: e0d0, shift: 16, result: ffffffff1f2fffff
  * movk: abab, shift:  0, result: ffffffff1f2fabab

Another example with movz, before:

  * movz: 0000, shift:  0, result: 0000000000000000
  * movk: fea0, shift: 32, result: 0000fea000000000

After:

  * movz: fea0, shift: 32, result: 0000fea000000000

Moreover, reuse emit_a64_mov_i() for 32 bit immediates that
are loaded via emit_a64_mov_i64() which is a similar optimization
as done in 6fe8b9c1f41d ("bpf, x64: save several bytes by using
mov over movabsq when possible"). On arm64, the latter allows
a single instruction with movn to be used due to zero extension where
otherwise two would be needed. And last but not least add a
missing optimization in emit_a64_mov_i() where movn is used but
the subsequent movk is not needed. With some of the Cilium programs
in use, this shrinks the needed instructions by about three
percent. Tested on Cavium ThunderX CN8890.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 arch/arm64/net/bpf_jit_comp.c | 86 +++++++++++++++++++++++++++----------------
 1 file changed, 55 insertions(+), 31 deletions(-)

diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 85113ca..fb75e9c 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -79,23 +79,67 @@ static inline void emit(const u32 insn, struct jit_ctx *ctx)
 	ctx->idx++;
 }
 
+static inline void emit_a64_mov_i(const int is64, const int reg,
+				  const s32 val, struct jit_ctx *ctx)
+{
+	u16 hi = val >> 16;
+	u16 lo = val & 0xffff;
+
+	if (hi & 0x8000) {
+		if (hi == 0xffff) {
+			emit(A64_MOVN(is64, reg, (u16)~lo, 0), ctx);
+		} else {
+			emit(A64_MOVN(is64, reg, (u16)~hi, 16), ctx);
+			if (lo != 0xffff)
+				emit(A64_MOVK(is64, reg, lo, 0), ctx);
+		}
+	} else {
+		emit(A64_MOVZ(is64, reg, lo, 0), ctx);
+		if (hi)
+			emit(A64_MOVK(is64, reg, hi, 16), ctx);
+	}
+}
+
+static int i64_i16_blocks(const u64 val, bool inverse)
+{
+	return (((val >>  0) & 0xffff) != (inverse ? 0xffff : 0x0000)) +
+	       (((val >> 16) & 0xffff) != (inverse ? 0xffff : 0x0000)) +
+	       (((val >> 32) & 0xffff) != (inverse ? 0xffff : 0x0000)) +
+	       (((val >> 48) & 0xffff) != (inverse ? 0xffff : 0x0000));
+}
+
 static inline void emit_a64_mov_i64(const int reg, const u64 val,
 				    struct jit_ctx *ctx)
 {
-	u64 tmp = val;
-	int shift = 0;
-
-	emit(A64_MOVZ(1, reg, tmp & 0xffff, shift), ctx);
-	tmp >>= 16;
-	shift += 16;
-	while (tmp) {
-		if (tmp & 0xffff)
-			emit(A64_MOVK(1, reg, tmp & 0xffff, shift), ctx);
-		tmp >>= 16;
-		shift += 16;
+	u64 nrm_tmp = val, rev_tmp = ~val;
+	bool inverse;
+	int shift;
+
+	if (!(nrm_tmp >> 32))
+		return emit_a64_mov_i(0, reg, (u32)val, ctx);
+
+	inverse = i64_i16_blocks(nrm_tmp, true) < i64_i16_blocks(nrm_tmp, false);
+	shift = max(round_down((inverse ? (fls64(rev_tmp) - 1) :
+					  (fls64(nrm_tmp) - 1)), 16), 0);
+	if (inverse)
+		emit(A64_MOVN(1, reg, (rev_tmp >> shift) & 0xffff, shift), ctx);
+	else
+		emit(A64_MOVZ(1, reg, (nrm_tmp >> shift) & 0xffff, shift), ctx);
+	shift -= 16;
+	while (shift >= 0) {
+		if (((nrm_tmp >> shift) & 0xffff) != (inverse ? 0xffff : 0x0000))
+			emit(A64_MOVK(1, reg, (nrm_tmp >> shift) & 0xffff, shift), ctx);
+		shift -= 16;
 	}
 }
 
+/*
+ * This is an unoptimized 64 immediate emission used for BPF to BPF call
+ * addresses. It will always do a full 64 bit decomposition as otherwise
+ * more complexity in the last extra pass is required since we previously
+ * reserved 4 instructions for the address.
+ */
 static inline void emit_addr_mov_i64(const int reg, const u64 val,
 				     struct jit_ctx *ctx)
 {
@@ -110,26 +154,6 @@ static inline void emit_addr_mov_i64(const int reg, const u64 val,
 	}
 }
 
-static inline void emit_a64_mov_i(const int is64, const int reg,
-				  const s32 val, struct jit_ctx *ctx)
-{
-	u16 hi = val >> 16;
-	u16 lo = val & 0xffff;
-
-	if (hi & 0x8000) {
-		if (hi == 0xffff) {
-			emit(A64_MOVN(is64, reg, (u16)~lo, 0), ctx);
-		} else {
-			emit(A64_MOVN(is64, reg, (u16)~hi, 16), ctx);
-			emit(A64_MOVK(is64, reg, lo, 0), ctx);
-		}
-	} else {
-		emit(A64_MOVZ(is64, reg, lo, 0), ctx);
-		if (hi)
-			emit(A64_MOVK(is64, reg, hi, 16), ctx);
-	}
-}
-
 static inline int bpf2a64_offset(int bpf_to, int bpf_from,
 				 const struct jit_ctx *ctx)
 {
-- 
2.9.5

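A rough user-space model of the selection logic described in the commit
message, for checking the examples by hand. It assumes the four 16-bit
blocks sit at shifts 0, 16, 32 and 48, leaves out the 32 bit shortcut
via emit_a64_mov_i(), and simulates movz/movn/movk on a plain uint64_t
instead of emitting instructions. The names blocks_differing(),
top_block_shift() and mov_i64_model() are made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Count the 16-bit blocks of val that differ from the fill pattern. */
static int blocks_differing(uint64_t val, bool inverse)
{
	const uint64_t fill = inverse ? 0xffff : 0x0000;
	int n = 0;

	for (int shift = 0; shift < 64; shift += 16)
		n += ((val >> shift) & 0xffff) != fill;
	return n;
}

/* Shift of the highest non-zero 16-bit block of v, or 0 if v is 0. */
static int top_block_shift(uint64_t v)
{
	int shift = 48;

	while (shift > 0 && !((v >> shift) & 0xffff))
		shift -= 16;
	return shift;
}

/* Simulate the sequence; returns the instruction count, *out the result. */
static int mov_i64_model(uint64_t val, uint64_t *out)
{
	bool inverse = blocks_differing(val, true) < blocks_differing(val, false);
	int shift = top_block_shift(inverse ? ~val : val);
	uint64_t reg;
	int insns = 1;

	if (inverse)	/* movn: write the inverted block, all other bits set */
		reg = ~(((~val >> shift) & 0xffff) << shift);
	else		/* movz: write the block, all other bits cleared */
		reg = ((val >> shift) & 0xffff) << shift;

	for (shift -= 16; shift >= 0; shift -= 16) {
		uint64_t block = (val >> shift) & 0xffff;

		if (block != (inverse ? 0xffff : 0x0000)) {
			/* movk: replace one block, keep the rest */
			reg = (reg & ~(0xffffULL << shift)) | (block << shift);
			insns++;
		}
	}
	assert(reg == val);
	*out = reg;
	return insns;
}
```

Feeding it the three constants from the commit message gives 1, 2 and 1
instructions respectively.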

* [PATCH bpf-next 5/6] bpf, arm64: save 4 bytes of unneeded stack space
From: Daniel Borkmann @ 2018-05-14  0:26 UTC (permalink / raw)
  To: alexei.starovoitov; +Cc: netdev, Daniel Borkmann
In-Reply-To: <20180514002612.12083-1-daniel@iogearbox.net>

Follow-up to 816d9ef32a8b ("bpf, arm64: remove ld_abs/ld_ind") in
that the extra 4 byte JIT scratchpad is not needed anymore since it
was used in ld_abs/ld_ind as the stack buffer for bpf_load_pointer().

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 arch/arm64/net/bpf_jit_comp.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 0b40c8f..85113ca 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -21,7 +21,6 @@
 #include <linux/bpf.h>
 #include <linux/filter.h>
 #include <linux/printk.h>
-#include <linux/skbuff.h>
 #include <linux/slab.h>
 
 #include <asm/byteorder.h>
@@ -188,7 +187,7 @@ static int build_prologue(struct jit_ctx *ctx)
 	 *                        | ... | BPF prog stack
 	 *                        |     |
 	 *                        +-----+ <= (BPF_FP - prog->aux->stack_depth)
-	 *                        |RSVD | JIT scratchpad
+	 *                        |RSVD | padding
 	 * current A64_SP =>      +-----+ <= (BPF_FP - ctx->stack_size)
 	 *                        |     |
 	 *                        | ... | Function call stack
@@ -220,9 +219,7 @@ static int build_prologue(struct jit_ctx *ctx)
 		return -1;
 	}
 
-	/* 4 byte extra for skb_copy_bits buffer */
-	ctx->stack_size = prog->aux->stack_depth + 4;
-	ctx->stack_size = STACK_ALIGN(ctx->stack_size);
+	ctx->stack_size = STACK_ALIGN(prog->aux->stack_depth);
 
 	/* Set up function call stack */
 	emit(A64_SUB_I(1, A64_SP, A64_SP, ctx->stack_size), ctx);
-- 
2.9.5

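A small sketch of the arithmetic behind the change, assuming that
STACK_ALIGN() rounds up to a multiple of 16 bytes. The macro and
function names here (STACK_ALIGN_MODEL, stack_size_before,
stack_size_after) are stand-ins for illustration and are not taken
from the JIT.

```c
#include <assert.h>

/* Assumption: round sz up to the next multiple of 16. */
#define STACK_ALIGN_MODEL(sz) (((sz) + 15) & ~15)

/* Stack size with the extra 4 byte buffer included. */
static int stack_size_before(int stack_depth)
{
	assert(stack_depth >= 0);
	return STACK_ALIGN_MODEL(stack_depth + 4);
}

/* Stack size with only the BPF program's own stack depth. */
static int stack_size_after(int stack_depth)
{
	assert(stack_depth >= 0);
	return STACK_ALIGN_MODEL(stack_depth);
}
```

Under that assumption a stack_depth of 32 needed 48 bytes before and 32
bytes after, while a depth of 24 needs 32 bytes either way, so the
saving shows up only when the extra 4 bytes crossed a 16-byte boundary.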

* [PATCH bpf-next 4/6] bpf, arm32: save 4 bytes of unneeded stack space
From: Daniel Borkmann @ 2018-05-14  0:26 UTC (permalink / raw)
  To: alexei.starovoitov; +Cc: netdev, Daniel Borkmann
In-Reply-To: <20180514002612.12083-1-daniel@iogearbox.net>

The extra skb_copy_bits() buffer is not used anymore, therefore
remove the extra 4 byte stack space requirement.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 arch/arm/net/bpf_jit_32.c | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index 82689b9..d3ea645 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -234,18 +234,11 @@ static void jit_fill_hole(void *area, unsigned int size)
 #define SCRATCH_SIZE 80
 
 /* total stack size used in JITed code */
-#define _STACK_SIZE \
-	(ctx->prog->aux->stack_depth + \
-	 + SCRATCH_SIZE + \
-	 + 4 /* extra for skb_copy_bits buffer */)
-
-#define STACK_SIZE ALIGN(_STACK_SIZE, STACK_ALIGNMENT)
+#define _STACK_SIZE	(ctx->prog->aux->stack_depth + SCRATCH_SIZE)
+#define STACK_SIZE	ALIGN(_STACK_SIZE, STACK_ALIGNMENT)
 
 /* Get the offset of eBPF REGISTERs stored on scratch space. */
-#define STACK_VAR(off) (STACK_SIZE-off-4)
-
-/* Offset of skb_copy_bits buffer */
-#define SKB_BUFFER STACK_VAR(SCRATCH_SIZE)
+#define STACK_VAR(off) (STACK_SIZE - off)
 
 #if __LINUX_ARM_ARCH__ < 7
 
-- 
2.9.5

* Re: [PATCH net] qede: Fix ref-cnt usage count
From: David Miller @ 2018-05-14  0:27 UTC (permalink / raw)
  To: Michal.Kalderon; +Cc: netdev, linux-rdma, chad.dupuis, ariel.elior
In-Reply-To: <20180513175406.21350-1-Michal.Kalderon@cavium.com>

From: Michal Kalderon <Michal.Kalderon@cavium.com>
Date: Sun, 13 May 2018 20:54:06 +0300

> Rebooting while qedr is loaded with a VLAN interface present
> results in unregister_netdevice waiting for the usage count
> to become free.
> The fix is that rdma devices should be removed before unregistering
> the netdevice, to assure all references to ndev are decreased.
> 
> Fixes: cee9fbd8e2e9 ("qede: Add qedr framework")
> Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
> Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>

Applied.

* Re: [PATCH] net: ipv4: ipconfig: fix unused variable
From: David Miller @ 2018-05-14  0:28 UTC (permalink / raw)
  To: anders.roxell; +Cc: kuznet, yoshfuji, netdev, linux-kernel
In-Reply-To: <20180513194830.29946-1-anders.roxell@linaro.org>

From: Anders Roxell <anders.roxell@linaro.org>
Date: Sun, 13 May 2018 21:48:30 +0200

> When CONFIG_PROC_FS isn't set, variable ipconfig_dir isn't used.
> net/ipv4/ipconfig.c:167:31: warning: ‘ipconfig_dir’ defined but not used [-Wunused-variable]
>  static struct proc_dir_entry *ipconfig_dir;
>                                ^~~~~~~~~~~~
> Move the declaration of ipconfig_dir inside the CONFIG_PROC_FS ifdef to
> fix the warning.
> 
> Fixes: c04d2cb2009f ("ipconfig: Write NTP server IPs to /proc/net/ipconfig/ntp_servers")
> Signed-off-by: Anders Roxell <anders.roxell@linaro.org>

Applied.

* pull-request: bpf 2018-05-14
From: Daniel Borkmann @ 2018-05-14  0:47 UTC (permalink / raw)
  To: davem; +Cc: daniel, ast, netdev

Hi David,

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix nfp to allow zero-length BPF capabilities. Without the fix, the
   nfp capability parsing loop exits early if the last capability is
   zero length, and the driver then fails to probe with an error such as:

     nfp: BPF capabilities left after parsing, parsed:92 total length:100
     nfp: invalid BPF capabilities at offset:92

   Fix from Jakub.

2) libbpf's bpf_object__open() may return IS_ERR_OR_NULL() and not
   just an error. Fix libbpf's bpf_prog_load_xattr() to handle that
   case as well, also from Jakub.
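
To illustrate item 1, here is a user-space sketch of a capability
parsing loop that stops short of a zero-length trailing entry. It is a
guess that is consistent with the error message above, not the nfp
driver code: the 8-byte header layout (4-byte type, 4-byte length), the
host-endian length field and the names CAP_HDR_LEN and parse_caps() are
all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CAP_HDR_LEN 8u	/* assumed: 4-byte type followed by 4-byte length */

/*
 * Walk the capability area and return the number of bytes consumed.
 * The caller treats any value other than total as a parse failure.
 */
static size_t parse_caps(const uint8_t *buf, size_t total,
			 int allow_zero_len_tail)
{
	size_t off = 0;

	assert(buf != NULL);
	for (;;) {
		uint32_t length;
		/* Strict '<' refuses a header that ends exactly at total. */
		int room = allow_zero_len_tail ?
			   off + CAP_HDR_LEN <= total :
			   off + CAP_HDR_LEN < total;

		if (!room)
			break;
		memcpy(&length, buf + off + 4, sizeof(length));
		if (length > total - off - CAP_HDR_LEN)
			break;	/* value would overrun the area */
		off += CAP_HDR_LEN + length;
	}
	return off;
}
```

For a 100-byte area holding one 84-byte capability followed by a
zero-length one, the strict variant stops at 92 while the relaxed one
consumes all 100 bytes.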

Please consider pulling these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git

Thanks a lot!

----------------------------------------------------------------

The following changes since commit 3148dedfe79e422f448a10250d3e2cdf8b7ee617:

  r8169: fix powering up RTL8168h (2018-05-08 22:54:18 -0400)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git 

for you to fetch changes up to 3597683c9da602b0440c5f742d64fa5da79cc026:

  tools: bpf: handle NULL return in bpf_prog_load_xattr() (2018-05-11 00:20:53 +0200)

----------------------------------------------------------------
Jakub Kicinski (2):
      nfp: bpf: allow zero-length capabilities
      tools: bpf: handle NULL return in bpf_prog_load_xattr()

 drivers/net/ethernet/netronome/nfp/bpf/main.c | 2 +-
 tools/lib/bpf/libbpf.c                        | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

* Re: [PATCH 00/15] Netfilter/IPVS fixes for net
From: David Miller @ 2018-05-14  1:05 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev
In-Reply-To: <20180513223656.10077-1-pablo@netfilter.org>

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Mon, 14 May 2018 00:36:41 +0200

> The following patchset contains Netfilter/IPVS fixes for your net tree,
> they are:
 ...
> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Pulled, thanks.

* Re: [PATCH bpf-next 2/6] bpf, sparc: remove unused variable
From: David Miller @ 2018-05-14  1:06 UTC (permalink / raw)
  To: daniel; +Cc: alexei.starovoitov, netdev
In-Reply-To: <20180514002612.12083-3-daniel@iogearbox.net>

From: Daniel Borkmann <daniel@iogearbox.net>
Date: Mon, 14 May 2018 02:26:08 +0200

> Since fe83963b7c38 ("bpf, sparc64: remove ld_abs/ld_ind") it's not
> used anymore therefore remove it.
> 
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

Acked-by: David S. Miller <davem@davemloft.net>

* Re: pull-request: bpf 2018-05-14
From: David Miller @ 2018-05-14  1:07 UTC (permalink / raw)
  To: daniel; +Cc: ast, netdev
In-Reply-To: <20180514004710.14377-1-daniel@iogearbox.net>

From: Daniel Borkmann <daniel@iogearbox.net>
Date: Mon, 14 May 2018 02:47:10 +0200

> The following pull-request contains BPF updates for your *net* tree.
> 
> The main changes are:
> 
> 1) Fix nfp to allow zero-length BPF capabilities, meaning the nfp
>    capability parsing loop will otherwise exit early if the last
>    capability is zero length and therefore driver will fail to probe
>    with an error such as:
> 
>      nfp: BPF capabilities left after parsing, parsed:92 total length:100
>      nfp: invalid BPF capabilities at offset:92
> 
>    Fix from Jakub.
> 
> 2) libbpf's bpf_object__open() may return IS_ERR_OR_NULL() and not
>    just an error. Fix libbpf's bpf_prog_load_xattr() to handle that
>    case as well, also from Jakub.
> 
> Please consider pulling these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git

Pulled, thanks Daniel.

* Re:Re: Re: Re: [PATCH net] net: Correct wrong skb_flow_limit check when enable RPS
From: Gao Feng @ 2018-05-14  1:26 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: davem@davemloft.net, daniel@iogearbox.net,
	jakub.kicinski@netronome.com, David Ahern, netdev@vger.kernel.org
In-Reply-To: <CAF=yD-KNCVU0v0jnM4jaAV3nM90wY376UQ+2MMsrEaVwkMdS3A@mail.gmail.com>

At 2018-05-11 22:56:04, "Willem de Bruijn" <willemdebruijn.kernel@gmail.com> wrote:
>On Fri, May 11, 2018 at 10:44 AM, Gao Feng <gfree.wind@vip.163.com> wrote:
>> At 2018-05-11 21:23:55, "Willem de Bruijn" <willemdebruijn.kernel@gmail.com> wrote:
>>>On Fri, May 11, 2018 at 2:20 AM, Gao Feng <gfree.wind@vip.163.com> wrote:
>>>> At 2018-05-11 11:54:55, "Willem de Bruijn" <willemdebruijn.kernel@gmail.com> wrote:
>>>>>On Thu, May 10, 2018 at 4:28 AM,  <gfree.wind@vip.163.com> wrote:
>>>>>> From: Gao Feng <gfree.wind@vip.163.com>
>>>>>>
>>>>>> The skb flow limit is implemented for each CPU independently. In the
>>>>>> current codes, the function skb_flow_limit gets the softnet_data by
>>>>>> this_cpu_ptr. But the target cpu of enqueue_to_backlog would be not
>>>>>> the current cpu when enable RPS. As the result, the skb_flow_limit checks
>>>>>> the stats of current CPU, while the skb is going to append the queue of
>>>>>> another CPU. It isn't the expected behavior.
>>>>>>
>>>>>> Now pass the softnet_data as a param to skb_flow_limit to make them consistent.
>>>>>
>>>>>The local cpu softnet_data is used on purpose. The operations in
>>>>>skb_flow_limit() on sd fields could race if not executed on the local cpu.
>>>>
>>>> I think the race doesn't exist because of the rps_lock.
>>>> The enqueue_to_backlog has hold the rps_lock before skb_flow_limit.
>>>
>>>Indeed, I overlooked that. There still is the matter of cache contention.
>>
>> The cache contention is really important in this case?
>> I don't think so, because the enqueue_to_backlog have touched and modified the softnet_stat
>> of target cpu.
>>
>>>
>>>>>Flow limit tries to detect large ("elephant") DoS flows with a fixed four-tuple.
>>>>>These would always hit the same RPS cpu, so that cpu being backlogged
>>>>
>>>> They may hit the different target CPU when enable RFS. Because the app could be scheduled
>>>> to another CPU, then RFS tries to deliver the skb to latest core which has hot cache.
>>>
>>>This even more suggest using the initial (or IRQ) cpu to track state, instead
>>>of the destination (RPS/RFS) cpu.
>>
>> I couldn't understand why it is better to track state on initial cpu, not the target cpu.
>> The latter one could get more accurate result.
>
>For a single DoS flow with normal cpu pinned IRQs, the results will be equally
>good when tracked on the initial IRQ cpu..
>
>>
>>>
>>>>>may be an indication that such a flow is active. But the flow will also always
>>>>>arrive on the same initial cpu courtesy of RSS. So storing the lookup table
>>>>
>>>> The RSS couldn't make sure the irq is handled by same cpu. It would be balanced between
>>>> the cpus.
>>>
>>>IRQs are usually pinned to cores. Unless using something like irqbalance,
>>>but that operates at too coarse a timescale to do anything useful at Mpps
>>>packet rates.
>>
>> There are some motherboard which couldn't make sure the irq is pinned.
>> The flow_limit wouldn't work as well as expected.
>
>.. this seems to be the crux of the argument. I am not aware of any network
>interrupts that do not adhere to the cpu pinning configuration in
>
>   /proc/irq/$IRQ/smp_affinity(_list)
>

When smp_affinity is set to 0xff, I have seen some hardware deliver most
IRQs to one specific cpu and spread the remaining IRQs across the other
cpus. So it cannot guarantee that the IRQs of one flow are delivered to
a fixed cpu.

I don't know whether this is caused by the APIC or the motherboard.
What information do you need to confirm it?

>What kind of hardware ignores this setting and sprays interrupts? I agree
>that in that case flow_limit as is may be ineffective (if migration happens
>at rates comparable to packet rates). But this should not happen?
>
>>
>>>
>>>>>on the initial CPU is also fine. There may be false positives on other CPUs
>>>>>with the same RPS destination, but that is unlikely with a highly concurrent
>>>>>traffic server mix ("mice").
>>>>
>>>> If my comment is right, the flow couldn't always arrive one the same initial cpu,  although
>>>> it may be sent to one same target cpu.
>>>>
>>>>>
>>>>>Note that the sysctl net.core.flow_limit_cpu_bitmap enables the feature
>>>>>for the cpus on which traffic initially lands, not the RPS destination cpus.
>>>>>See also Documentation/networking/scaling.txt
>>>>>
>>>>>That said, I had to reread the code, as it does seem sensible that the
>>>>>same softnet_data is intended to be used both when testing qlen and
>>>>>flow_limit.
>>>>
>>>> In most cases, user configures the same RPS map with flow_limit like 0xff.
>>>> Because user couldn't predict which core the evil flow would arrive on.
>>>>
>>>> Take an example, there are 2 cores, cpu0 and cpu1.
>>>> One flow is the an evil flow, but the irq is sent to cpu0. After RPS/RFS, the target cpu is cpu1.
>>>> Now cpu0 invokes enqueue_to_backlog, then the skb_flow_limit checkes the queue length
>>>> of cpu0. Certainly it could pass the check of skb_flow_limit because there is no any evil flow on cpu0.
>>>
>>>No, enqueue_to_backlog passes qlen to skb_flow_limit, so that does
>>>check the queue length of the RPS cpu.
>>
>> Sorry, I overlooked the qlen is the length of the rps cpu.
>> Then it's ok unless the stats may be not accurate when irq isn't pinned.
>>
>> But I still doubt that is it really important to track state on initial cpu, not target cpu?
>> Because the enqueue_to_backlog have touched the softnet_data of target cpu.
>
>I think the merit of both IRQ and RPS cpu can be argued for attaching the
>flow_limit state.
>
>Either way, the current behavior is not a bug, so I don't think that this is a
>candidate for net.
>
>The cost of moving from IRQ to RPS cpu will be the cacheline contention
>on a system with multiple IRQ cpus that all try to update the sd->flow_data
>of the same RPS cpus. Which is particularly likely with RFS. I suspect that
>this cost is non-trivial and not worth the benefit of handling hardware with
>unpinned IRQs.

Ok, I agree with you.
Thanks for the detailed discussion.

Best Regards
Feng

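For readers following the thread, a simplified user-space model of the
kind of per-CPU bookkeeping being discussed. It is a sketch, not the
kernel's skb_flow_limit(): the window and table sizes, the struct
layout and the names flow_limit_model, flow_limit_init() and
flow_limit_should_drop() are assumptions. The state is passed in
explicitly, so the caller decides whether it belongs to the IRQ cpu or
to the RPS cpu, which is the question debated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HISTORY_LEN 128u	/* assumed window size, power of two */
#define NUM_BUCKETS 4096u	/* assumed table size, power of two */

struct flow_limit_model {
	unsigned int head;		/* next history slot to overwrite */
	uint16_t history[HISTORY_LEN];	/* buckets of the most recent packets */
	uint8_t buckets[NUM_BUCKETS];	/* packets per bucket in the window */
};

static void flow_limit_init(struct flow_limit_model *fl)
{
	assert(fl != NULL);
	memset(fl, 0, sizeof(*fl));
}

/* Return true if this packet's flow dominates the recent window. */
static bool flow_limit_should_drop(struct flow_limit_model *fl,
				   uint32_t flow_hash, unsigned int qlen,
				   unsigned int max_backlog)
{
	uint16_t new_flow, old_flow;

	assert(fl != NULL);
	/* Only account flows once the queue is at least half full. */
	if (qlen < max_backlog / 2)
		return false;

	new_flow = flow_hash & (NUM_BUCKETS - 1);
	old_flow = fl->history[fl->head];
	fl->history[fl->head] = new_flow;
	fl->head = (fl->head + 1) & (HISTORY_LEN - 1);

	/* The oldest packet leaves the window, the new one enters it. */
	if (fl->buckets[old_flow])
		fl->buckets[old_flow]--;
	return ++fl->buckets[new_flow] > HISTORY_LEN / 2;
}
```

In this model a single flow starts being dropped once it accounts for
more than half of the last 128 packets seen while the queue is at least
half full. Packets seen while the queue is shorter leave the state
untouched.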

* Re: [PATCH net V2] tun: fix use after free for ptr_ring
From: Jason Wang @ 2018-05-14  1:52 UTC (permalink / raw)
  To: Cong Wang
  Cc: Linux Kernel Network Developers, LKML, Eric Dumazet,
	Michael S. Tsirkin
In-Reply-To: <CAM_iQpU4YCrcp2XfAyFaUNtjLUi7SyoQ8exzMV7XnoW-=LV3sg@mail.gmail.com>



On 2018-05-12 01:39, Cong Wang wrote:
> On Thu, May 10, 2018 at 7:49 PM, Jason Wang <jasowang@redhat.com> wrote:
>>   static void __tun_detach(struct tun_file *tfile, bool clean)
>>   {
>>          struct tun_file *ntfile;
>> @@ -736,7 +727,8 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
>>                              tun->dev->reg_state == NETREG_REGISTERED)
>>                                  unregister_netdevice(tun->dev);
>>                  }
>> -               tun_cleanup_tx_ring(tfile);
>> +               if (tun)
>> +                       xdp_rxq_info_unreg(&tfile->xdp_rxq);
>>                  sock_put(&tfile->sk);
>>          }
>>   }
>> @@ -783,14 +775,14 @@ static void tun_detach_all(struct net_device *dev)
>>                  tun_napi_del(tun, tfile);
>>                  /* Drop read queue */
>>                  tun_queue_purge(tfile);
>> +               xdp_rxq_info_unreg(&tfile->xdp_rxq);
>>                  sock_put(&tfile->sk);
>> -               tun_cleanup_tx_ring(tfile);
>>          }
>>          list_for_each_entry_safe(tfile, tmp, &tun->disabled, next) {
>>                  tun_enable_queue(tfile);
>>                  tun_queue_purge(tfile);
>> +               xdp_rxq_info_unreg(&tfile->xdp_rxq);
>>                  sock_put(&tfile->sk);
>> -               tun_cleanup_tx_ring(tfile);
> Are you sure it is safe?
>
> xdp_rxq_info_unreg() can't be called more than once either,
> please make sure the warning that commit c13da21cdb80
> ("tun: avoid calling xdp_rxq_info_unreg() twice") fixed will not
> show up again.

I think it's safe. xdp_rxq_info_unreg() is called when the socket is
detached from the netdevice, and there are only two possible paths: release()
and uninit(). We synchronize them through the rtnl lock.
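
A minimal userspace sketch of that argument, assuming a hypothetical
`demo_queue` type and a boolean flag standing in for the rtnl lock. None of
these names exist in tun.c; the sketch only shows that two serialized
teardown paths sharing one "still attached?" check run the unregister step
at most once:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the rtnl lock: a flag that records ownership. */
static bool demo_rtnl_held;

static void demo_rtnl_lock(void)
{
	assert(!demo_rtnl_held);
	demo_rtnl_held = true;
}

static void demo_rtnl_unlock(void)
{
	assert(demo_rtnl_held);
	demo_rtnl_held = false;
}

/* Hypothetical per-queue state. */
struct demo_queue {
	bool attached;		/* still attached to the device? */
	int unreg_calls;	/* how many times the unregister step ran */
};

/* Unregister only if this caller is the one that performs the detach. */
static void demo_detach_locked(struct demo_queue *q)
{
	assert(q != NULL);
	assert(demo_rtnl_held);	/* both paths must hold the lock */
	if (!q->attached)
		return;
	q->attached = false;
	q->unreg_calls++;
}

/* The two teardown paths funnel into the same serialized detach. */
static void demo_release(struct demo_queue *q)
{
	demo_rtnl_lock();
	demo_detach_locked(q);
	demo_rtnl_unlock();
}

static void demo_uninit(struct demo_queue *q)
{
	demo_rtnl_lock();
	demo_detach_locked(q);
	demo_rtnl_unlock();
}
```

Whichever of the two paths runs first clears `attached`, so the second one
finds nothing left to unregister.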

Thanks

^ permalink raw reply

* Re: [PATCH 2/2] alx: add disable_wol paramenter
From: AceLan Kao @ 2018-05-14  1:55 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: David Miller, James Cliburn, Chris Snook, rakesh, netdev,
	Linux-Kernel@Vger. Kernel. Org, Emily Chien
In-Reply-To: <20180510123425.GB5527@lunn.ch>

Okay, I'll submit a new patch with some more description of why we
need this feature.
Thanks.

2018-05-10 20:34 GMT+08:00 Andrew Lunn <andrew@lunn.ch>:
> On Thu, May 10, 2018 at 01:58:24PM +0800, AceLan Kao wrote:
>> Hi Andrew,
>>
>> We have some machines using Qualcomm Atheros Killer E2400 Gigabit
>> Ethernet Controller,
>> but none of them has the unintentional wake up issue.
>> We're willing to fix it if we encounter the issue, but before we can
>> do that, we need this feature to be supported by the driver.
>>
>> Given that the feature was removed 5 years ago, I doubt
>> we can still reproduce this issue,
>> but again, to verify it we need to add this feature back first.
>> Leaving WoL disabled by default won't introduce any regression, but gives
>> users and developers a chance to fix it.
>
> The main problem here is the module parameter. That is not going to be
> accepted.
>
> Can you argue the cure is worse than the disease? Is WoL not working
> considered by a lot of people as being a bug? Double wake up is also a
> bug, but not many people care, it does not cause any data corruption,
> etc. So can you argue overall we have a less buggy system, but still
> buggy, if WoL is enabled?
>
> If you can write a convincing Change Message arguing the case, a patch
> simply re-enabling WoL might be accepted.
>
> But you also need to take on the responsibility to help debug the
> failed shutdowns in order to get to the bottom of this problem.
>
>        Andrew

^ permalink raw reply

* linux-next: manual merge of the bpf-next tree with the bpf tree
From: Stephen Rothwell @ 2018-05-14  1:57 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov, Networking
  Cc: Linux-Next Mailing List, Linux Kernel Mailing List,
	Jakub Kicinski

Hi all,

Today's linux-next merge of the bpf-next tree got a conflict in:

  tools/lib/bpf/libbpf.c

between commit:

  3597683c9da6 ("tools: bpf: handle NULL return in bpf_prog_load_xattr()")

from the bpf tree and commit:

  17387dd5ac2c ("tools: bpf: don't complain about no kernel version for networking code")

from the bpf-next tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc tools/lib/bpf/libbpf.c
index 8da4eeb101a6,df54c4c9e48a..000000000000
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@@ -2163,9 -2193,12 +2193,12 @@@ int bpf_prog_load_xattr(const struct bp
  
  	if (!attr)
  		return -EINVAL;
+ 	if (!attr->file)
+ 		return -EINVAL;
  
- 	obj = bpf_object__open(attr->file);
+ 	obj = __bpf_object__open(attr->file, NULL, 0,
+ 				 bpf_prog_type__needs_kver(attr->prog_type));
 -	if (IS_ERR(obj))
 +	if (IS_ERR_OR_NULL(obj))
  		return -ENOENT;
  
  	bpf_object__for_each_program(prog, obj) {
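
The resolution keeps the stricter IS_ERR_OR_NULL() check from the bpf tree.
The difference between the two checks can be sketched in plain userspace C.
This is a simplified re-implementation of the error-pointer idea for
illustration only; the `demo_` names are hypothetical and the actual
definitions used by tools/lib/bpf may differ:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Errors are encoded in the top DEMO_MAX_ERRNO values of the address space. */
#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static void *demo_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True only for encoded errors; NULL is NOT an error pointer. */
static bool demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* True for encoded errors and for NULL. */
static bool demo_is_err_or_null(const void *ptr)
{
	return ptr == NULL || demo_is_err(ptr);
}

/* Decode an error pointer back into its negative errno value. */
static long demo_ptr_err(const void *ptr)
{
	assert(demo_is_err(ptr));
	return (long)(intptr_t)ptr;
}
```

With these definitions a NULL return slips past `demo_is_err()` but is
caught by `demo_is_err_or_null()`, which is the case the stricter check is
meant to cover.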


^ permalink raw reply

* Re: RTL8723BE performance regression
From: Pkshih @ 2018-05-14  2:50 UTC (permalink / raw)
  To: jprvita@gmail.com
  Cc: linux-kernel@vger.kernel.org, Larry.Finger@lwfinger.net,
	jprvita@endlessm.com, Birming Chiu, drake@endlessm.com,
	Chaoming_Li, kvalo@codeaurora.org, 莊彥宣,
	derosier@gmail.com, Steven Ting, netdev@vger.kernel.org,
	linux@endlessm.com, Shaofu, linux-wireless@vger.kernel.org
In-Reply-To: <CA+A7VXWhi24iw=TpoWQe6AY-pyvF6-C03dc+AN9kXeLPhLni+g@mail.gmail.com>

On Wed, 2018-05-09 at 13:33 -0700, João Paulo Rechi Vita wrote:
> On Tue, May 8, 2018 at 1:37 AM, Pkshih <pkshih@realtek.com> wrote:
> > On Mon, 2018-05-07 at 14:49 -0700, João Paulo Rechi Vita wrote:
> >> On Tue, May 1, 2018 at 10:58 PM, Pkshih <pkshih@realtek.com> wrote:
> >> > On Wed, 2018-05-02 at 05:44 +0000, Pkshih wrote:
> >> >>
> >> >> > -----Original Message-----
> >> >> > From: João Paulo Rechi Vita [mailto:jprvita@gmail.com]
> >> >> > Sent: Wednesday, May 02, 2018 6:41 AM
> >> >> > To: Larry Finger
> >> >> > Cc: Steve deRosier; 莊彥宣; Pkshih; Birming Chiu; Shaofu; Steven Ting; Chaoming_Li; Kalle Valo;
> >> >> > linux-wireless; Network Development; LKML; Daniel Drake; João Paulo Rechi Vita; linux@endlessm.com
> >> >> > Subject: Re: RTL8723BE performance regression
> >> >> >
> >> >> > On Tue, Apr 3, 2018 at 7:51 PM, Larry Finger <Larry.Finger@lwfinger.net> wrote:
> >> >> > > On 04/03/2018 09:37 PM, João Paulo Rechi Vita wrote:
> >> >> > >>
> >> >> > >> On Tue, Apr 3, 2018 at 7:28 PM, Larry Finger <Larry.Finger@lwfinger.net>
> >> >> > >> wrote:
> >> >> > >>
> >> >> > >> (...)
> >> >> > >>
> >> >> > >>> As the antenna selection code changes affected your first bisection, do
> >> >> > >>> you
> >> >> > >>> have one of those HP laptops with only one antenna and the incorrect
> >> >> > >>> coding
> >> >> > >>> in the FUSE?
> >> >> > >>
> >> >> > >>
> >> >> > >> Yes, that is why I've been passing ant_sel=1 during my tests -- this
> >> >> > >> was needed to achieve a good performance in the past, before this
> >> >> > >> regression. I've also opened the laptop chassis and confirmed the
> >> >> > >> antenna cable is plugged to the connector labeled with "1" on the
> >> >> > >> card.
> >> >> > >>
> >> >> > >>> If so, please make sure that you still have the same signal
> >> >> > >>> strength for good and bad cases. I have tried to keep the driver and the
> >> >> > >>> btcoex code in sync, but there may be some combinations of antenna
> >> >> > >>> configuration and FUSE contents that cause the code to fail.
> >> >> > >>>
> >> >> > >>
> >> >> > >> What is the recommended way to monitor the signal strength?
> >> >> > >
> >> >> > >
> >> >> > > The btcoex code is developed for multiple platforms by a different group
> >> >> > > than the Linux driver. I think they made a change that caused ant_sel to
> >> >> > > switch from 1 to 2. At least numerous comments at
> >> >> > > github.com/lwfinger/rtlwifi_new claimed they needed to make that change.
> >> >> > >
> >> >> > > My recommended method is to verify the wifi device name with "iw dev". Then
> >> >> > > using that device
> >> >> > >
> >> >> > > sudo iw dev <dev_name> scan | egrep "SSID|signal"
> >> >> > >
> >> >> >
> >> >> > I have confirmed that the performance regression is indeed tied to
> >> >> > signal strength: on the good cases signal was between -16 and -8 dBm,
> >> >> > whereas in bad cases signal was always between -50 and -40 dBm. I've
> >> >> > also switched to testing bandwidth in a controlled LAN environment using
> >> >> > iperf3, as suggested by Steve deRosier, with the DUT being the only
> >> >> > machine connected to the 2.4 GHz radio and the machine running the
> >> >> > iperf3 server connected via ethernet.
> >> >> >
> >> >>
> >> >> We have new experimental results in commit af8a41cccf8f46 ("rtlwifi: cleanup
> >> >> 8723be ant_sel definition"). You can use the above commit and do the same
> >> >> experiments (with ant_sel=0, 1 and 2) in your side, and then share your results.
> >> >> Since performance is tied to signal strength, you can only share signal strength.
> >> >>
> >> >
> >> > Please pay attention to cold reboot once ant_sel is changed.
> >> >
> >>
> >> I've tested the commit mentioned above and it fixes the problem on top
> >> of v4.16 (the latest wireless-drivers-next is also fixed, as it
> >> already contains that commit). On v4.15, we also need the
> >> following commits before "af8a41cccf8f rtlwifi: cleanup 8723be ant_sel
> >> definition" to have good performance again:
> >>
> >>   874e837d67d0 rtlwifi: fill FW version and subversion
> >>   a44709bba70f rtlwifi: btcoex: Add power_on_setting routine
> >>   40d9dd4f1c5d rtlwifi: btcoex: Remove global variables from btcoex
> >
> > v4.15 isn't a longterm version and has reached EOL.
> >
> 
> Right, but this is a performance regression in comparison to v4.11, so
> if "af8a41cccf8f rtlwifi: cleanup 8723be ant_sel definition" is marked
> for stable, shouldn't these other patches be brought as well? All
> releases since v4.11 are probably affected, but honestly I don't have
> a strong understanding of how the stable trees operate in situations
> like this.
> 

see below.

> >>
> >> Surprisingly, it seems forcing ant_sel=1 is not needed anymore on
> >> these machines, as shown by the numbers below (ant_sel=0 means
> >> that no parameter was actually passed to the module). I powered
> >> off the machine and did a cold boot for every test. It seems
> >> something has changed in the antenna auto-selection code since v4.11,
> >> the latest point where I could confirm we definitely need to force
> >> ant_sel=1. I've been trying to understand what causes this difference,
> >> but haven't made progress on that so far, so any suggestions are
> >> appreciated (we are trying to decide if we can confidently drop the
> >> downstream DMI quirks for these specific machines).
> >>
> > I think your rtl8723be module has the correct efuse content programmed,
> > so it works properly with ant_sel=0, and the quirk isn't required for
> > your machine.
> >
> >>   w-d-n ant_sel=0: -14.00 dBm,  69.5 Mbps -> good
> >>   w-d-n ant_sel=1: -10.00 dBm,  41.1 Mbps -> good
> >>   w-d-n ant_sel=2: -44.00 dBm,   607 kbps -> bad
> >>
> >>   v4.16 ant_sel=0: -12.00 dBm,  63.0 Mbps -> good
> >>   v4.16 ant_sel=1: - 8.00 dBm,  69.0 Mbps -> good
> >>   v4.16 ant_sel=2: -50.00 dBm,   224 kbps -> bad
> >>
> >>   v4.15 ant_sel=0: - 8.00 dBm,  33.0 Mbps -> good
> >>   v4.15 ant_sel=1: -10.00 dBm,  38.1 Mbps -> good
> >>   v4.15 ant_sel=2: -48.00 dBm,   206 kbps -> bad
> >>
> >
> > Judging from your results, the efuse content is programmed for one or
> > two antennas on the AUX path.
> >
> 
> With v4.11 I had good performance results on this very same machine
> (thus same efuse contents) only when passing ant_sel=1, so there has
> to be some change on the code that parses the efuse contents and
> decides which antenna will be used.
> 

Since btcoex controls the TDMA parameters for WiFi and BT, the antenna-related
code is put in btcoex. That's why ant_sel is used by btcoex.
In v4.12, we upgraded btcoex and the firmware in order to yield a better
balance between WiFi and BT; meanwhile, the code flow changed somewhat. So the
single commit af8a41cccf8f ("rtlwifi: cleanup 8723be ant_sel definition") won't
work on v4.11. In other words, if you want v4.11 to work properly, you need to
apply all the btcoex changes.

The efuse parser hasn't changed, and I think the reason why v4.11 needs
ant_sel=1 is the same as above.

Regards
PK


^ permalink raw reply

* Why ebt_dnat.c doesn't need to re-search fdb after changing dmac?
From: Yi Li @ 2018-05-14  2:51 UTC (permalink / raw)
  To: netdev

Hi List,
As the subject states, I couldn't find a clue. Suppose we add dnat
in NF_BR_LOCAL_OUT and change the destination MAC of our locally
generated packets. In theory, we need to search the fdb hash again with
the new dmac as the key, in order to get the right bridge port to send
the packets out.
But I can't find where that second lookup happens. Did I miss something?
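
To make the question concrete, here is a toy userspace model. It assumes a
hypothetical `demo_fdb_entry` table and a linear lookup; it is not the
bridge code and says nothing about where the kernel actually performs the
lookup. It only shows why a port chosen before the rewrite could differ
from one chosen after it:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ETH_ALEN 6

/* Hypothetical forwarding entry: destination MAC -> bridge port number. */
struct demo_fdb_entry {
	uint8_t mac[DEMO_ETH_ALEN];
	int port;
};

/* Linear lookup; returns the port, or -1 when the MAC is unknown. */
static int demo_fdb_lookup(const struct demo_fdb_entry *fdb, size_t n,
			   const uint8_t *dmac)
{
	assert(fdb != NULL && dmac != NULL);
	for (size_t i = 0; i < n; i++)
		if (memcmp(fdb[i].mac, dmac, DEMO_ETH_ALEN) == 0)
			return fdb[i].port;
	return -1;
}

/* Hypothetical DNAT step: overwrite the destination MAC in place. */
static void demo_dnat(uint8_t *dmac, const uint8_t *new_dmac)
{
	assert(dmac != NULL && new_dmac != NULL);
	memcpy(dmac, new_dmac, DEMO_ETH_ALEN);
}
```

If the lookup ran only before `demo_dnat()`, the frame would be sent to the
port that belongs to the old destination MAC.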




regards,
Yi

^ permalink raw reply

* [PATCH v2] Revert "alx: remove WoL support"
From: AceLan Kao @ 2018-05-14  3:28 UTC (permalink / raw)
  To: Jay Cliburn, Chris Snook, David S . Miller, Rakesh Pandit, netdev,
	Emily Chien, Andrew Lunn, linux-kernel

This reverts commit bc2bebe8de8ed4ba6482c9cc370b0dd72ffe8cd2.

WoL support is required to pass Energy Star 6.1 and above, since
power consumption is measured during S3 with WoL enabled.

Revert "alx: remove WoL support"; we will try to fix the unintentional
wake-up issue that occurs when WoL is enabled.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=61651

Signed-off-by: AceLan Kao <acelan.kao@canonical.com>
---
 drivers/net/ethernet/atheros/alx/ethtool.c |  36 +++++
 drivers/net/ethernet/atheros/alx/hw.c      | 154 ++++++++++++++++++++-
 drivers/net/ethernet/atheros/alx/hw.h      |   5 +
 drivers/net/ethernet/atheros/alx/main.c    | 142 +++++++++++++++++--
 4 files changed, 326 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/atheros/alx/ethtool.c b/drivers/net/ethernet/atheros/alx/ethtool.c
index 2f4eabf652e8..859e27236ce4 100644
--- a/drivers/net/ethernet/atheros/alx/ethtool.c
+++ b/drivers/net/ethernet/atheros/alx/ethtool.c
@@ -310,11 +310,47 @@ static int alx_get_sset_count(struct net_device *netdev, int sset)
 	}
 }
 
+static void alx_get_wol(struct net_device *netdev, struct ethtool_wolinfo *wol)
+{
+	struct alx_priv *alx = netdev_priv(netdev);
+	struct alx_hw *hw = &alx->hw;
+
+	wol->supported = WAKE_MAGIC | WAKE_PHY;
+	wol->wolopts = 0;
+
+	if (hw->sleep_ctrl & ALX_SLEEP_WOL_MAGIC)
+		wol->wolopts |= WAKE_MAGIC;
+	if (hw->sleep_ctrl & ALX_SLEEP_WOL_PHY)
+		wol->wolopts |= WAKE_PHY;
+}
+
+static int alx_set_wol(struct net_device *netdev, struct ethtool_wolinfo *wol)
+{
+	struct alx_priv *alx = netdev_priv(netdev);
+	struct alx_hw *hw = &alx->hw;
+
+	if (wol->wolopts & ~(WAKE_MAGIC | WAKE_PHY))
+		return -EOPNOTSUPP;
+
+	hw->sleep_ctrl = 0;
+
+	if (wol->wolopts & WAKE_MAGIC)
+		hw->sleep_ctrl |= ALX_SLEEP_WOL_MAGIC;
+	if (wol->wolopts & WAKE_PHY)
+		hw->sleep_ctrl |= ALX_SLEEP_WOL_PHY;
+
+	device_set_wakeup_enable(&alx->hw.pdev->dev, hw->sleep_ctrl);
+
+	return 0;
+}
+
 const struct ethtool_ops alx_ethtool_ops = {
 	.get_pauseparam	= alx_get_pauseparam,
 	.set_pauseparam	= alx_set_pauseparam,
 	.get_msglevel	= alx_get_msglevel,
 	.set_msglevel	= alx_set_msglevel,
+	.get_wol	= alx_get_wol,
+	.set_wol	= alx_set_wol,
 	.get_link	= ethtool_op_get_link,
 	.get_strings	= alx_get_strings,
 	.get_sset_count	= alx_get_sset_count,
diff --git a/drivers/net/ethernet/atheros/alx/hw.c b/drivers/net/ethernet/atheros/alx/hw.c
index 6ac40b0003a3..f9bf612550ab 100644
--- a/drivers/net/ethernet/atheros/alx/hw.c
+++ b/drivers/net/ethernet/atheros/alx/hw.c
@@ -332,6 +332,16 @@ void alx_set_macaddr(struct alx_hw *hw, const u8 *addr)
 	alx_write_mem32(hw, ALX_STAD1, val);
 }
 
+static void alx_enable_osc(struct alx_hw *hw)
+{
+	u32 val;
+
+	/* rising edge */
+	val = alx_read_mem32(hw, ALX_MISC);
+	alx_write_mem32(hw, ALX_MISC, val & ~ALX_MISC_INTNLOSC_OPEN);
+	alx_write_mem32(hw, ALX_MISC, val | ALX_MISC_INTNLOSC_OPEN);
+}
+
 static void alx_reset_osc(struct alx_hw *hw, u8 rev)
 {
 	u32 val, val2;
@@ -774,7 +784,6 @@ int alx_setup_speed_duplex(struct alx_hw *hw, u32 ethadv, u8 flowctrl)
 	return err;
 }
 
-
 void alx_post_phy_link(struct alx_hw *hw)
 {
 	u16 phy_val, len, agc;
@@ -848,6 +857,65 @@ void alx_post_phy_link(struct alx_hw *hw)
 	}
 }
 
+/* NOTE:
+ *    1. phy link must be established before calling this function
+ *    2. wol option (pattern,magic,link,etc.) is configed before call it.
+ */
+int alx_pre_suspend(struct alx_hw *hw, int speed, u8 duplex)
+{
+	u32 master, mac, phy, val;
+	int err = 0;
+
+	master = alx_read_mem32(hw, ALX_MASTER);
+	master &= ~ALX_MASTER_PCLKSEL_SRDS;
+	mac = hw->rx_ctrl;
+	/* 10/100 half */
+	ALX_SET_FIELD(mac, ALX_MAC_CTRL_SPEED,  ALX_MAC_CTRL_SPEED_10_100);
+	mac &= ~(ALX_MAC_CTRL_FULLD | ALX_MAC_CTRL_RX_EN | ALX_MAC_CTRL_TX_EN);
+
+	phy = alx_read_mem32(hw, ALX_PHY_CTRL);
+	phy &= ~(ALX_PHY_CTRL_DSPRST_OUT | ALX_PHY_CTRL_CLS);
+	phy |= ALX_PHY_CTRL_RST_ANALOG | ALX_PHY_CTRL_HIB_PULSE |
+	       ALX_PHY_CTRL_HIB_EN;
+
+	/* without any activity  */
+	if (!(hw->sleep_ctrl & ALX_SLEEP_ACTIVE)) {
+		err = alx_write_phy_reg(hw, ALX_MII_IER, 0);
+		if (err)
+			return err;
+		phy |= ALX_PHY_CTRL_IDDQ | ALX_PHY_CTRL_POWER_DOWN;
+	} else {
+		if (hw->sleep_ctrl & (ALX_SLEEP_WOL_MAGIC | ALX_SLEEP_CIFS))
+			mac |= ALX_MAC_CTRL_RX_EN | ALX_MAC_CTRL_BRD_EN;
+		if (hw->sleep_ctrl & ALX_SLEEP_CIFS)
+			mac |= ALX_MAC_CTRL_TX_EN;
+		if (duplex == DUPLEX_FULL)
+			mac |= ALX_MAC_CTRL_FULLD;
+		if (speed == SPEED_1000)
+			ALX_SET_FIELD(mac, ALX_MAC_CTRL_SPEED,
+				      ALX_MAC_CTRL_SPEED_1000);
+		phy |= ALX_PHY_CTRL_DSPRST_OUT;
+		err = alx_write_phy_ext(hw, ALX_MIIEXT_ANEG,
+					ALX_MIIEXT_S3DIG10,
+					ALX_MIIEXT_S3DIG10_SL);
+		if (err)
+			return err;
+	}
+
+	alx_enable_osc(hw);
+	hw->rx_ctrl = mac;
+	alx_write_mem32(hw, ALX_MASTER, master);
+	alx_write_mem32(hw, ALX_MAC_CTRL, mac);
+	alx_write_mem32(hw, ALX_PHY_CTRL, phy);
+
+	/* set val of PDLL D3PLLOFF */
+	val = alx_read_mem32(hw, ALX_PDLL_TRNS1);
+	val |= ALX_PDLL_TRNS1_D3PLLOFF_EN;
+	alx_write_mem32(hw, ALX_PDLL_TRNS1, val);
+
+	return 0;
+}
+
 bool alx_phy_configured(struct alx_hw *hw)
 {
 	u32 cfg, hw_cfg;
@@ -920,6 +988,26 @@ int alx_clear_phy_intr(struct alx_hw *hw)
 	return alx_read_phy_reg(hw, ALX_MII_ISR, &isr);
 }
 
+int alx_config_wol(struct alx_hw *hw)
+{
+	u32 wol = 0;
+	int err = 0;
+
+	/* turn on magic packet event */
+	if (hw->sleep_ctrl & ALX_SLEEP_WOL_MAGIC)
+		wol |= ALX_WOL0_MAGIC_EN | ALX_WOL0_PME_MAGIC_EN;
+
+	/* turn on link up event */
+	if (hw->sleep_ctrl & ALX_SLEEP_WOL_PHY) {
+		wol |=  ALX_WOL0_LINK_EN | ALX_WOL0_PME_LINK;
+		/* only link up can wake up */
+		err = alx_write_phy_reg(hw, ALX_MII_IER, ALX_IER_LINK_UP);
+	}
+	alx_write_mem32(hw, ALX_WOL0, wol);
+
+	return err;
+}
+
 void alx_disable_rss(struct alx_hw *hw)
 {
 	u32 ctrl = alx_read_mem32(hw, ALX_RXQ0);
@@ -1044,6 +1132,70 @@ void alx_mask_msix(struct alx_hw *hw, int index, bool mask)
 	alx_post_write(hw);
 }
 
+int alx_select_powersaving_speed(struct alx_hw *hw, int *speed, u8 *duplex)
+{
+	int i, err;
+	u16 lpa;
+
+	err = alx_read_phy_link(hw);
+	if (err)
+		return err;
+
+	if (hw->link_speed == SPEED_UNKNOWN) {
+		*speed = SPEED_UNKNOWN;
+		*duplex = DUPLEX_UNKNOWN;
+		return 0;
+	}
+
+	err = alx_read_phy_reg(hw, MII_LPA, &lpa);
+	if (err)
+		return err;
+
+	if (!(lpa & LPA_LPACK)) {
+		*speed = hw->link_speed;
+		return 0;
+	}
+
+	if (lpa & LPA_10FULL) {
+		*speed = SPEED_10;
+		*duplex = DUPLEX_FULL;
+	} else if (lpa & LPA_10HALF) {
+		*speed = SPEED_10;
+		*duplex = DUPLEX_HALF;
+	} else if (lpa & LPA_100FULL) {
+		*speed = SPEED_100;
+		*duplex = DUPLEX_FULL;
+	} else {
+		*speed = SPEED_100;
+		*duplex = DUPLEX_HALF;
+	}
+
+	if (*speed == hw->link_speed && *duplex == hw->duplex)
+		return 0;
+	err = alx_write_phy_reg(hw, ALX_MII_IER, 0);
+	if (err)
+		return err;
+	err = alx_setup_speed_duplex(hw, alx_speed_to_ethadv(*speed, *duplex) |
+					ADVERTISED_Autoneg, ALX_FC_ANEG |
+					ALX_FC_RX | ALX_FC_TX);
+	if (err)
+		return err;
+
+	/* wait for linkup */
+	for (i = 0; i < ALX_MAX_SETUP_LNK_CYCLE; i++) {
+		msleep(100);
+
+		err = alx_read_phy_link(hw);
+		if (err < 0)
+			return err;
+		if (hw->link_speed != SPEED_UNKNOWN)
+			break;
+	}
+	if (i == ALX_MAX_SETUP_LNK_CYCLE)
+		return -ETIMEDOUT;
+
+	return 0;
+}
 
 bool alx_get_phy_info(struct alx_hw *hw)
 {
diff --git a/drivers/net/ethernet/atheros/alx/hw.h b/drivers/net/ethernet/atheros/alx/hw.h
index e42d7e0947eb..a7fb6c8d846a 100644
--- a/drivers/net/ethernet/atheros/alx/hw.h
+++ b/drivers/net/ethernet/atheros/alx/hw.h
@@ -487,6 +487,8 @@ struct alx_hw {
 	u8 flowctrl;
 	u32 adv_cfg;
 
+	u32 sleep_ctrl;
+
 	spinlock_t mdio_lock;
 	struct mdio_if_info mdio;
 	u16 phy_id[2];
@@ -549,12 +551,14 @@ void alx_reset_pcie(struct alx_hw *hw);
 void alx_enable_aspm(struct alx_hw *hw, bool l0s_en, bool l1_en);
 int alx_setup_speed_duplex(struct alx_hw *hw, u32 ethadv, u8 flowctrl);
 void alx_post_phy_link(struct alx_hw *hw);
+int alx_pre_suspend(struct alx_hw *hw, int speed, u8 duplex);
 int alx_read_phy_reg(struct alx_hw *hw, u16 reg, u16 *phy_data);
 int alx_write_phy_reg(struct alx_hw *hw, u16 reg, u16 phy_data);
 int alx_read_phy_ext(struct alx_hw *hw, u8 dev, u16 reg, u16 *pdata);
 int alx_write_phy_ext(struct alx_hw *hw, u8 dev, u16 reg, u16 data);
 int alx_read_phy_link(struct alx_hw *hw);
 int alx_clear_phy_intr(struct alx_hw *hw);
+int alx_config_wol(struct alx_hw *hw);
 void alx_cfg_mac_flowcontrol(struct alx_hw *hw, u8 fc);
 void alx_start_mac(struct alx_hw *hw);
 int alx_reset_mac(struct alx_hw *hw);
@@ -563,6 +567,7 @@ bool alx_phy_configured(struct alx_hw *hw);
 void alx_configure_basic(struct alx_hw *hw);
 void alx_mask_msix(struct alx_hw *hw, int index, bool mask);
 void alx_disable_rss(struct alx_hw *hw);
+int alx_select_powersaving_speed(struct alx_hw *hw, int *speed, u8 *duplex);
 bool alx_get_phy_info(struct alx_hw *hw);
 void alx_update_hw_stats(struct alx_hw *hw);
 
diff --git a/drivers/net/ethernet/atheros/alx/main.c b/drivers/net/ethernet/atheros/alx/main.c
index 567ee54504bc..c0e2bb22ce24 100644
--- a/drivers/net/ethernet/atheros/alx/main.c
+++ b/drivers/net/ethernet/atheros/alx/main.c
@@ -1070,6 +1070,7 @@ static int alx_init_sw(struct alx_priv *alx)
 	alx->dev->max_mtu = ALX_MAX_FRAME_LEN(ALX_MAX_FRAME_SIZE);
 	alx->tx_ringsz = 256;
 	alx->rx_ringsz = 512;
+	hw->sleep_ctrl = ALX_SLEEP_WOL_MAGIC | ALX_SLEEP_WOL_PHY;
 	hw->imt = 200;
 	alx->int_mask = ALX_ISR_MISC;
 	hw->dma_chnl = hw->max_dma_chnl;
@@ -1346,6 +1347,66 @@ static int alx_stop(struct net_device *netdev)
 	return 0;
 }
 
+static int __alx_shutdown(struct pci_dev *pdev, bool *wol_en)
+{
+	struct alx_priv *alx = pci_get_drvdata(pdev);
+	struct net_device *netdev = alx->dev;
+	struct alx_hw *hw = &alx->hw;
+	int err, speed;
+	u8 duplex;
+
+	netif_device_detach(netdev);
+
+	if (netif_running(netdev))
+		__alx_stop(alx);
+
+#ifdef CONFIG_PM_SLEEP
+	err = pci_save_state(pdev);
+	if (err)
+		return err;
+#endif
+
+	err = alx_select_powersaving_speed(hw, &speed, &duplex);
+	if (err)
+		return err;
+	err = alx_clear_phy_intr(hw);
+	if (err)
+		return err;
+	err = alx_pre_suspend(hw, speed, duplex);
+	if (err)
+		return err;
+	err = alx_config_wol(hw);
+	if (err)
+		return err;
+
+	*wol_en = false;
+	if (hw->sleep_ctrl & ALX_SLEEP_ACTIVE) {
+		netif_info(alx, wol, netdev,
+			   "wol: ctrl=%X, speed=%X\n",
+			   hw->sleep_ctrl, speed);
+		device_set_wakeup_enable(&pdev->dev, true);
+		*wol_en = true;
+	}
+
+	pci_disable_device(pdev);
+
+	return 0;
+}
+
+static void alx_shutdown(struct pci_dev *pdev)
+{
+	int err;
+	bool wol_en;
+
+	err = __alx_shutdown(pdev, &wol_en);
+	if (!err) {
+		pci_wake_from_d3(pdev, wol_en);
+		pci_set_power_state(pdev, PCI_D3hot);
+	} else {
+		dev_err(&pdev->dev, "shutdown fail %d\n", err);
+	}
+}
+
 static void alx_link_check(struct work_struct *work)
 {
 	struct alx_priv *alx;
@@ -1841,6 +1902,8 @@ static int alx_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		goto out_unmap;
 	}
 
+	device_set_wakeup_enable(&pdev->dev, hw->sleep_ctrl);
+
 	netdev_info(netdev,
 		    "Qualcomm Atheros AR816x/AR817x Ethernet [%pM]\n",
 		    netdev->dev_addr);
@@ -1883,12 +1946,22 @@ static void alx_remove(struct pci_dev *pdev)
 static int alx_suspend(struct device *dev)
 {
 	struct pci_dev *pdev = to_pci_dev(dev);
-	struct alx_priv *alx = pci_get_drvdata(pdev);
+	int err;
+	bool wol_en;
+
+	err = __alx_shutdown(pdev, &wol_en);
+	if (err) {
+		dev_err(&pdev->dev, "shutdown fail in suspend %d\n", err);
+		return err;
+	}
+
+	if (wol_en) {
+		pci_prepare_to_sleep(pdev);
+	} else {
+		pci_wake_from_d3(pdev, false);
+		pci_set_power_state(pdev, PCI_D3hot);
+	}
 
-	if (!netif_running(alx->dev))
-		return 0;
-	netif_device_detach(alx->dev);
-	__alx_stop(alx);
 	return 0;
 }
 
@@ -1896,23 +1969,69 @@ static int alx_resume(struct device *dev)
 {
 	struct pci_dev *pdev = to_pci_dev(dev);
 	struct alx_priv *alx = pci_get_drvdata(pdev);
+	struct net_device *netdev = alx->dev;
 	struct alx_hw *hw = &alx->hw;
+	int err;
+
+	pci_set_power_state(pdev, PCI_D0);
+	pci_restore_state(pdev);
+	pci_save_state(pdev);
+
+	pci_enable_wake(pdev, PCI_D3hot, 0);
+	pci_enable_wake(pdev, PCI_D3cold, 0);
 
+	hw->link_speed = SPEED_UNKNOWN;
+	alx->int_mask = ALX_ISR_MISC;
+
+	alx_reset_pcie(hw);
 	alx_reset_phy(hw);
 
-	if (!netif_running(alx->dev))
-		return 0;
-	netif_device_attach(alx->dev);
-	return __alx_open(alx, true);
+	pci_set_power_state(pdev, PCI_D0);
+	pci_restore_state(pdev);
+	pci_save_state(pdev);
+
+	pci_enable_wake(pdev, PCI_D3hot, 0);
+	pci_enable_wake(pdev, PCI_D3cold, 0);
+
+	hw->link_speed = SPEED_UNKNOWN;
+	alx->int_mask = ALX_ISR_MISC;
+
+	alx_reset_pcie(hw);
+	alx_reset_phy(hw);
+
+	err = alx_reset_mac(hw);
+	if (err) {
+		netif_err(alx, hw, alx->dev,
+			  "resume:reset_mac fail %d\n", err);
+		return -EIO;
+	}
+
+	err = alx_setup_speed_duplex(hw, hw->adv_cfg, hw->flowctrl);
+	if (err) {
+		netif_err(alx, hw, alx->dev,
+			  "resume:setup_speed_duplex fail %d\n", err);
+		return -EIO;
+	}
+
+	if (netif_running(netdev)) {
+		err = __alx_open(alx, true);
+		if (err)
+			return err;
+	}
+
+	netif_device_attach(netdev);
+
+	return err;
 }
+#endif
 
+#ifdef CONFIG_PM_SLEEP
 static SIMPLE_DEV_PM_OPS(alx_pm_ops, alx_suspend, alx_resume);
 #define ALX_PM_OPS      (&alx_pm_ops)
 #else
 #define ALX_PM_OPS      NULL
 #endif
 
-
 static pci_ers_result_t alx_pci_error_detected(struct pci_dev *pdev,
 					       pci_channel_state_t state)
 {
@@ -1955,6 +2074,8 @@ static pci_ers_result_t alx_pci_error_slot_reset(struct pci_dev *pdev)
 	}
 
 	pci_set_master(pdev);
+	pci_enable_wake(pdev, PCI_D3hot, 0);
+	pci_enable_wake(pdev, PCI_D3cold, 0);
 
 	alx_reset_pcie(hw);
 	if (!alx_reset_mac(hw))
@@ -2011,6 +2132,7 @@ static struct pci_driver alx_driver = {
 	.id_table    = alx_pci_tbl,
 	.probe       = alx_probe,
 	.remove      = alx_remove,
+	.shutdown    = alx_shutdown,
 	.err_handler = &alx_err_handlers,
 	.driver.pm   = ALX_PM_OPS,
 };
-- 
2.17.0
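
The option handling in alx_get_wol()/alx_set_wol() above can be sketched
outside the kernel as follows. This is only an illustration: the `DEMO_`
constants and their bit values are hypothetical stand-ins for the real
WAKE_* and ALX_SLEEP_* definitions, and the functions operate on a plain
integer instead of struct alx_hw:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values; the real constants live in kernel headers. */
#define DEMO_WAKE_PHY		(1u << 0)
#define DEMO_WAKE_MAGIC		(1u << 1)
#define DEMO_SLEEP_WOL_PHY	(1u << 4)
#define DEMO_SLEEP_WOL_MAGIC	(1u << 5)

/* Reject unsupported wake options, then translate the supported ones. */
static int demo_set_wol(uint32_t wolopts, uint32_t *sleep_ctrl)
{
	assert(sleep_ctrl != NULL);

	if (wolopts & ~(DEMO_WAKE_MAGIC | DEMO_WAKE_PHY))
		return -EOPNOTSUPP;	/* state is left untouched */

	*sleep_ctrl = 0;
	if (wolopts & DEMO_WAKE_MAGIC)
		*sleep_ctrl |= DEMO_SLEEP_WOL_MAGIC;
	if (wolopts & DEMO_WAKE_PHY)
		*sleep_ctrl |= DEMO_SLEEP_WOL_PHY;
	return 0;
}

/* Inverse translation, as reported back to the caller. */
static uint32_t demo_get_wol(uint32_t sleep_ctrl)
{
	uint32_t wolopts = 0;

	if (sleep_ctrl & DEMO_SLEEP_WOL_MAGIC)
		wolopts |= DEMO_WAKE_MAGIC;
	if (sleep_ctrl & DEMO_SLEEP_WOL_PHY)
		wolopts |= DEMO_WAKE_PHY;
	return wolopts;
}
```

Validating the mask before clearing the stored state means a rejected
request does not disturb a previously configured wake-up setting.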

^ permalink raw reply related

* Re: WARNING: suspicious RCU usage in tipc_bearer_find
From: Eric Biggers @ 2018-05-14  4:27 UTC (permalink / raw)
  To: syzbot
  Cc: davem, dvyukov, jon.maloy, linux-kernel, netdev, syzkaller-bugs,
	tipc-discussion, ying.xue
In-Reply-To: <94eb2c06cb284308e30564ccf9b9@google.com>

On Fri, Feb 09, 2018 at 12:00:01PM -0800, syzbot wrote:
> syzbot has found reproducer for the following crash on net-next commit
> 617aebe6a97efa539cc4b8a52adccd89596e6be0 (Sun Feb 4 00:25:42 2018 +0000)
> Merge tag 'usercopy-v4.16-rc1' of
> git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
> 
> So far this crash happened 13 times on net-next, upstream.
> C reproducer is attached.
> syzkaller reproducer is attached.
> Raw console output is attached.
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached.
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+b743957adcee51f5e0e3@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed.
> 
> 
> audit: type=1400 audit(1518206230.395:8): avc:  denied  { create } for
> pid=4164 comm="syzkaller756462"
> scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023
> tcontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023
> tclass=netlink_generic_socket permissive=1
> =============================
> audit: type=1400 audit(1518206230.396:9): avc:  denied  { write } for
> pid=4164 comm="syzkaller756462"
> scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023
> tcontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023
> tclass=netlink_generic_socket permissive=1
> WARNING: suspicious RCU usage
> 4.15.0+ #221 Not tainted
> -----------------------------
> net/tipc/bearer.c:177 suspicious rcu_dereference_protected() usage!
> 
> other info that might help us debug this:
> 
> 
> rcu_scheduler_active = 2, debug_locks = 1
> 2 locks held by syzkaller756462/4164:
>  #0:  (cb_lock){++++}, at: [<000000003bb01113>] genl_rcv+0x19/0x40
> net/netlink/genetlink.c:634
>  #1:  (genl_mutex){+.+.}, at: [<000000002e321e71>] genl_lock
> net/netlink/genetlink.c:33 [inline]
>  #1:  (genl_mutex){+.+.}, at: [<000000002e321e71>] genl_rcv_msg+0x115/0x140
> net/netlink/genetlink.c:622
> 
> stack backtrace:
> CPU: 0 PID: 4164 Comm: syzkaller756462 Not tainted 4.15.0+ #221
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:17 [inline]
>  dump_stack+0x194/0x257 lib/dump_stack.c:53
>  lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4592
>  tipc_bearer_find+0x2b4/0x3b0 net/tipc/bearer.c:177
>  tipc_nl_compat_link_set+0x329/0x9f0 net/tipc/netlink_compat.c:729
>  __tipc_nl_compat_doit net/tipc/netlink_compat.c:288 [inline]
>  tipc_nl_compat_doit+0x15b/0x670 net/tipc/netlink_compat.c:335
>  tipc_nl_compat_handle net/tipc/netlink_compat.c:1119 [inline]
>  tipc_nl_compat_recv+0x1135/0x18f0 net/tipc/netlink_compat.c:1201
>  genl_family_rcv_msg+0x7b7/0xfb0 net/netlink/genetlink.c:599
>  genl_rcv_msg+0xb2/0x140 net/netlink/genetlink.c:624
>  netlink_rcv_skb+0x14b/0x380 net/netlink/af_netlink.c:2442
>  genl_rcv+0x28/0x40 net/netlink/genetlink.c:635
>  netlink_unicast_kernel net/netlink/af_netlink.c:1308 [inline]
>  netlink_unicast+0x4c4/0x6b0 net/netlink/af_netlink.c:1334
>  netlink_sendmsg+0xa4a/0xe60 net/netlink/af_netlink.c:1897
>  sock_sendmsg_nosec net/socket.c:630 [inline]
>  sock_sendmsg+0xca/0x110 net/socket.c:640
>  ___sys_sendmsg+0x767/0x8b0 net/socket.c:2046
>  __sys_sendmsg+0xe5/0x210 net/socket.c:2080
>  SYSC_sendmsg net/socket.c:2091 [inline]
>  SyS_sendmsg+0x2d/0x50 net/socket.c:2087
>  entry_SYSCALL_64_fastpath+0x29/0xa0
> RIP: 0033:0x43fd69
> RSP: 002b:00007fff09979378 EFLAGS: 00000203 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043fd69
> RDX: 0000000000000000 RSI: 0000000020003000 RDI: 0000000000000003
> RBP: 00000000006ca018 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000203 R12: 0000000000401690
> R13: 0000000000401720 R14: 0000000000000000 R15: 0000000
> 

This was fixed by commit ed4ffdfec26df:

#syz fix: tipc: Fix missing RTNL lock protection during setting link properties

- Eric

^ permalink raw reply

* Re: [PATCH RESEND net-next v2 1/8] dt-bindings: net: dwmac-sun8i: Clean up clock delay chain descriptions
From: Chen-Yu Tsai @ 2018-05-14  4:59 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: devicetree, Maxime Ripard, netdev, Rob Herring, Corentin Labbe,
	Giuseppe Cavallaro, linux-arm-kernel, Icenowy Zheng
In-Reply-To: <20180513202938.GH12738@lunn.ch>

On Sun, May 13, 2018 at 1:29 PM, Andrew Lunn <andrew@lunn.ch> wrote:
> On Sun, May 13, 2018 at 01:11:08PM -0700, Chen-Yu Tsai wrote:
>> On Sun, May 13, 2018 at 1:05 PM, Andrew Lunn <andrew@lunn.ch> wrote:
>> >> > Hi Chen-Yu
>> >> >
>> >> > Are these delays the MAC applies? Not the PHY. It would be good to
>> >> > make it clear here these are MAC imposed delays.
>> >>
>> >> Yes these are applied on the MAC side. Being described in the device
>> >> tree bindings for the MAC, I thought this was implied to be the case?
>> >> Are there known exceptions?
>> >
>> > There is frequent confusion with this. Most of the time, the PHY does
>> > the delay, not the MAC, based on the phy-mode. So the MAC doing it is
>> > an exception in itself.
>> >
>> > Do you actually need these delays for the board you are adding support
>> > for? Does the PHY not support adding the needed delays? If you don't
>> > need the delays, i would not even implement them.
>>
>> Yes this is already used on the Bananapi M3. This patch merely reformats
>> the description and adds a note saying this only applies to RGMII mode.
>
> Yes, the current code is needed for the Bananapi M3. But you have
> another patch which extends the code to support a smaller range. Do
> you have a board which actually needs this? If not, i would not add
> that new code.

IIRC the delay on the PHY side is either 2ns or none. The delay on the
MAC side here is an order of magnitude smaller, likely fine tuning to cope
with board design deficiencies.

Currently no other board requires this, but this is already part of the
binding. The new stuff limits the range for a specific SoC, simply because
that is the range supported by the control register. Not implementing the
limit, i.e. accepting the whole range from the property, where values might
get truncated, doesn't make much sense to me.
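
A rough sketch of the difference, assuming a hypothetical 100 ps step and a
register field described by a contiguous low-bit mask (for example 0x7 for
a 3-bit field). The names and numbers are illustrative only and are not
taken from the driver:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical granularity of the delay chain, in picoseconds. */
#define DEMO_STEP_PS 100u

/*
 * Convert a delay from the device tree into a register field value.
 * field_mask must be a contiguous low-bit mask such as 0x7 or 0x1f.
 * Out-of-range or misaligned values are rejected instead of truncated.
 */
static int demo_delay_to_field(uint32_t delay_ps, uint32_t field_mask,
			       uint32_t *field)
{
	assert(field != NULL);

	if (delay_ps % DEMO_STEP_PS)
		return -EINVAL;
	if (delay_ps / DEMO_STEP_PS > field_mask)
		return -ERANGE;

	*field = delay_ps / DEMO_STEP_PS;
	return 0;
}

/* What silently masking an oversized value would program instead. */
static uint32_t demo_truncate_delay(uint32_t delay_ps, uint32_t field_mask)
{
	return (delay_ps / DEMO_STEP_PS) & field_mask;
}
```

With a 3-bit field, a requested 1000 ps would be masked down to 2 steps
(200 ps) without any warning, whereas the range check reports the problem.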

Regards
ChenYu

^ permalink raw reply

