Netdev List
* [PATCH 12/16] arm64: prefer __section from compiler_attributes.h
From: Nick Desaulniers @ 2019-08-12 21:50 UTC (permalink / raw)
  To: akpm
  Cc: sedat.dilek, jpoimboe, yhs, miguel.ojeda.sandonis,
	clang-built-linux, Nick Desaulniers, Catalin Marinas, Will Deacon,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Andrey Konovalov, Greg Kroah-Hartman, Enrico Weigelt,
	Suzuki K Poulose, Thomas Gleixner, Masayoshi Mizuma,
	Shaokun Zhang, Alexios Zavras, Allison Randal, linux-arm-kernel,
	linux-kernel, netdev, bpf
In-Reply-To: <20190812215052.71840-1-ndesaulniers@google.com>

GCC unescapes escaped string section names while Clang does not. Because
__section uses the `#` stringification operator for the section name, it
doesn't need to be escaped.

This antipattern was found with:
$ grep -e __section\(\" -e __section__\(\" -r

Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
---
 arch/arm64/include/asm/cache.h     | 2 +-
 arch/arm64/kernel/smp_spin_table.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h
index 64eeaa41e7ca..43da6dd29592 100644
--- a/arch/arm64/include/asm/cache.h
+++ b/arch/arm64/include/asm/cache.h
@@ -78,7 +78,7 @@ static inline u32 cache_type_cwg(void)
 	return (read_cpuid_cachetype() >> CTR_CWG_SHIFT) & CTR_CWG_MASK;
 }
 
-#define __read_mostly __attribute__((__section__(".data..read_mostly")))
+#define __read_mostly __section(.data..read_mostly)
 
 static inline int cache_line_size_of_cpu(void)
 {
diff --git a/arch/arm64/kernel/smp_spin_table.c b/arch/arm64/kernel/smp_spin_table.c
index 76c2739ba8a4..c8a3fee00c11 100644
--- a/arch/arm64/kernel/smp_spin_table.c
+++ b/arch/arm64/kernel/smp_spin_table.c
@@ -19,7 +19,7 @@
 #include <asm/smp_plat.h>
 
 extern void secondary_holding_pen(void);
-volatile unsigned long __section(".mmuoff.data.read")
+volatile unsigned long __section(.mmuoff.data.read)
 secondary_holding_pen_release = INVALID_HWID;
 
 static phys_addr_t cpu_release_addr[NR_CPUS];
-- 
2.23.0.rc1.153.gdeed80330f-goog



* [PATCH 15/16] include/linux/compiler.h: remove unused KENTRY macro
From: Nick Desaulniers @ 2019-08-12 21:50 UTC (permalink / raw)
  To: akpm
  Cc: sedat.dilek, jpoimboe, yhs, miguel.ojeda.sandonis,
	clang-built-linux, Nick Desaulniers, Luc Van Oostenryck,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	linux-sparse, linux-kernel, netdev, bpf
In-Reply-To: <20190812215052.71840-1-ndesaulniers@google.com>

This macro is not used anywhere in the kernel. Delete it rather than
updating its __section to a fully spelled out
__attribute__((__section__())), which would otherwise be needed to avoid
https://bugs.llvm.org/show_bug.cgi?id=42950.

Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
---
 include/linux/compiler.h | 23 -----------------------
 1 file changed, 23 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 5e88e7e33abe..f01c1e527f85 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -136,29 +136,6 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 } while (0)
 #endif
 
-/*
- * KENTRY - kernel entry point
- * This can be used to annotate symbols (functions or data) that are used
- * without their linker symbol being referenced explicitly. For example,
- * interrupt vector handlers, or functions in the kernel image that are found
- * programatically.
- *
- * Not required for symbols exported with EXPORT_SYMBOL, or initcalls. Those
- * are handled in their own way (with KEEP() in linker scripts).
- *
- * KENTRY can be avoided if the symbols in question are marked as KEEP() in the
- * linker script. For example an architecture could KEEP() its entire
- * boot/exception vector code rather than annotate each function and data.
- */
-#ifndef KENTRY
-# define KENTRY(sym)						\
-	extern typeof(sym) sym;					\
-	static const unsigned long __kentry_##sym		\
-	__used							\
-	__section("___kentry" "+" #sym )			\
-	= (unsigned long)&sym;
-#endif
-
 #ifndef RELOC_HIDE
 # define RELOC_HIDE(ptr, off)					\
   ({ unsigned long __ptr;					\
-- 
2.23.0.rc1.153.gdeed80330f-goog



* [PATCH 16/16] compiler_attributes.h: add note about __section
From: Nick Desaulniers @ 2019-08-12 21:50 UTC (permalink / raw)
  To: akpm
  Cc: sedat.dilek, jpoimboe, yhs, miguel.ojeda.sandonis,
	clang-built-linux, Nick Desaulniers, Alexei Starovoitov,
	Daniel Borkmann, Martin KaFai Lau, Song Liu, linux-kernel, netdev,
	bpf
In-Reply-To: <20190812215052.71840-1-ndesaulniers@google.com>

The antipattern described can be found with:
$ grep -e __section\(\" -r -e __section__\(\"

Link: https://bugs.llvm.org/show_bug.cgi?id=42950
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
---
 include/linux/compiler_attributes.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h
index 6b318efd8a74..f8c008d7f616 100644
--- a/include/linux/compiler_attributes.h
+++ b/include/linux/compiler_attributes.h
@@ -225,6 +225,16 @@
 #define __pure                          __attribute__((__pure__))
 
 /*
+ *  Note: Since this macro makes use of the "stringification operator" `#`, a
+ *        quoted string literal should not be passed to it. eg.
+ *        prefer:
+ *        __section(.foo)
+ *        to:
+ *        __section(".foo")
+ *        unless the section name is dynamically built up, in which case the
+ *        verbose __attribute__((__section__(".foo" x))) should be preferred.
+ *        See also: https://bugs.llvm.org/show_bug.cgi?id=42950
+ *
  *   gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-section-function-attribute
  *   gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#index-section-variable-attribute
  * clang: https://clang.llvm.org/docs/AttributeReference.html#section-declspec-allocate
-- 
2.23.0.rc1.153.gdeed80330f-goog



* [PATCH 00/16] treewide: prefer __section from compiler_attributes.h
From: Nick Desaulniers @ 2019-08-12 21:50 UTC (permalink / raw)
  To: akpm
  Cc: sedat.dilek, jpoimboe, yhs, miguel.ojeda.sandonis,
	clang-built-linux, Nick Desaulniers, Alexei Starovoitov,
	Daniel Borkmann, Martin KaFai Lau, Song Liu, netdev, bpf
In-Reply-To: <20190812215052.71840-1-ndesaulniers@google.com>

GCC unescapes escaped string section names while Clang does not. Because
__section uses the `#` stringification operator for the section name, it
doesn't need to be escaped.

This fixes an Oops observed in distros that use systemd and do not set
net.core.bpf_jit_enable=1, when their kernels are compiled with Clang.

Instead, we should:
1. Prefer __section(.section_name_no_quotes).
2. Only use __attribute__((__section__(".section"))) when creating the
section name via C preprocessor (see the definition of __define_initcall
in arch/um/include/shared/init.h).

This antipattern was found with:
$ grep -e __section\(\" -e __section__\(\" -r

See the discussions in:
https://bugs.llvm.org/show_bug.cgi?id=42950
https://marc.info/?l=linux-netdev&m=156412960619946&w=2

Nick Desaulniers (16):
  s390/boot: fix section name escaping
  arc: prefer __section from compiler_attributes.h
  parisc: prefer __section from compiler_attributes.h
  um: prefer __section from compiler_attributes.h
  sh: prefer __section from compiler_attributes.h
  ia64: prefer __section from compiler_attributes.h
  arm: prefer __section from compiler_attributes.h
  mips: prefer __section from compiler_attributes.h
  sparc: prefer __section from compiler_attributes.h
  powerpc: prefer __section and __printf from compiler_attributes.h
  x86: prefer __section from compiler_attributes.h
  arm64: prefer __section from compiler_attributes.h
  include/asm-generic: prefer __section from compiler_attributes.h
  include/linux: prefer __section from compiler_attributes.h
  include/linux/compiler.h: remove unused KENTRY macro
  compiler_attributes.h: add note about __section

 arch/arc/include/asm/linkage.h        |  8 +++----
 arch/arc/include/asm/mach_desc.h      |  3 +--
 arch/arm/include/asm/cache.h          |  2 +-
 arch/arm/include/asm/mach/arch.h      |  4 ++--
 arch/arm/include/asm/setup.h          |  2 +-
 arch/arm64/include/asm/cache.h        |  2 +-
 arch/arm64/kernel/smp_spin_table.c    |  2 +-
 arch/ia64/include/asm/cache.h         |  2 +-
 arch/mips/include/asm/cache.h         |  2 +-
 arch/parisc/include/asm/cache.h       |  2 +-
 arch/parisc/include/asm/ldcw.h        |  2 +-
 arch/powerpc/boot/main.c              |  3 +--
 arch/powerpc/boot/ps3.c               |  6 ++----
 arch/powerpc/include/asm/cache.h      |  2 +-
 arch/powerpc/kernel/btext.c           |  2 +-
 arch/s390/boot/startup.c              |  2 +-
 arch/sh/include/asm/cache.h           |  2 +-
 arch/sparc/include/asm/cache.h        |  2 +-
 arch/sparc/kernel/btext.c             |  2 +-
 arch/um/kernel/um_arch.c              |  6 +++---
 arch/x86/include/asm/cache.h          |  2 +-
 arch/x86/include/asm/intel-mid.h      |  2 +-
 arch/x86/include/asm/iommu_table.h    |  5 ++---
 arch/x86/include/asm/irqflags.h       |  2 +-
 arch/x86/include/asm/mem_encrypt.h    |  2 +-
 arch/x86/kernel/cpu/cpu.h             |  3 +--
 include/asm-generic/error-injection.h |  2 +-
 include/asm-generic/kprobes.h         |  5 ++---
 include/linux/cache.h                 |  6 +++---
 include/linux/compiler.h              | 31 ++++-----------------------
 include/linux/compiler_attributes.h   | 10 +++++++++
 include/linux/cpu.h                   |  2 +-
 include/linux/export.h                |  2 +-
 include/linux/init_task.h             |  4 ++--
 include/linux/interrupt.h             |  5 ++---
 include/linux/sched/debug.h           |  2 +-
 include/linux/srcutree.h              |  2 +-
 37 files changed, 62 insertions(+), 83 deletions(-)

-- 
2.23.0.rc1.153.gdeed80330f-goog



* [PATCH 14/16] include/linux: prefer __section from compiler_attributes.h
From: Nick Desaulniers @ 2019-08-12 21:50 UTC (permalink / raw)
  To: akpm
  Cc: sedat.dilek, jpoimboe, yhs, miguel.ojeda.sandonis,
	clang-built-linux, Nick Desaulniers, Luc Van Oostenryck,
	Lai Jiangshan, Paul E. McKenney, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, Alexei Starovoitov, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra (Intel), Nicholas Piggin, Jiri Kosina, Will Deacon,
	Ard Biesheuvel, Michael Ellerman, Masahiro Yamada,
	Hans Liljestrand, Elena Reshetova, David Windsor, Marc Zyngier,
	Ming Lei, Dou Liyang, Julien Thierry, Mauro Carvalho Chehab,
	Jens Axboe, linux-kernel, linux-sparse, rcu, netdev, bpf
In-Reply-To: <20190812215052.71840-1-ndesaulniers@google.com>

Link: https://github.com/ClangBuiltLinux/linux/issues/619
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
---
 include/linux/cache.h       | 6 +++---
 include/linux/compiler.h    | 8 ++++----
 include/linux/cpu.h         | 2 +-
 include/linux/export.h      | 2 +-
 include/linux/init_task.h   | 4 ++--
 include/linux/interrupt.h   | 5 ++---
 include/linux/sched/debug.h | 2 +-
 include/linux/srcutree.h    | 2 +-
 8 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/include/linux/cache.h b/include/linux/cache.h
index 750621e41d1c..3f4df9eef1e1 100644
--- a/include/linux/cache.h
+++ b/include/linux/cache.h
@@ -28,7 +28,7 @@
  * but may get written to during init, so can't live in .rodata (via "const").
  */
 #ifndef __ro_after_init
-#define __ro_after_init __attribute__((__section__(".data..ro_after_init")))
+#define __ro_after_init __section(.data..ro_after_init)
 #endif
 
 #ifndef ____cacheline_aligned
@@ -45,8 +45,8 @@
 
 #ifndef __cacheline_aligned
 #define __cacheline_aligned					\
-  __attribute__((__aligned__(SMP_CACHE_BYTES),			\
-		 __section__(".data..cacheline_aligned")))
+	__aligned(SMP_CACHE_BYTES)				\
+	__section(.data..cacheline_aligned)
 #endif /* __cacheline_aligned */
 
 #ifndef __cacheline_aligned_in_smp
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index f0fd5636fddb..5e88e7e33abe 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -24,7 +24,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 			long ______r;					\
 			static struct ftrace_likely_data		\
 				__aligned(4)				\
-				__section("_ftrace_annotated_branch")	\
+				__section(_ftrace_annotated_branch)	\
 				______f = {				\
 				.data.func = __func__,			\
 				.data.file = __FILE__,			\
@@ -60,7 +60,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 #define __trace_if_value(cond) ({			\
 	static struct ftrace_branch_data		\
 		__aligned(4)				\
-		__section("_ftrace_branch")		\
+		__section(_ftrace_branch)		\
 		__if_trace = {				\
 			.func = __func__,		\
 			.file = __FILE__,		\
@@ -118,7 +118,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 	".popsection\n\t"
 
 /* Annotate a C jump table to allow objtool to follow the code flow */
-#define __annotate_jump_table __section(".rodata..c_jump_table")
+#define __annotate_jump_table __section(.rodata..c_jump_table)
 
 #else
 #define annotate_reachable()
@@ -298,7 +298,7 @@ unsigned long read_word_at_a_time(const void *addr)
  * visible to the compiler.
  */
 #define __ADDRESSABLE(sym) \
-	static void * __section(".discard.addressable") __used \
+	static void * __section(.discard.addressable) __used \
 		__PASTE(__addressable_##sym, __LINE__) = (void *)&sym;
 
 /**
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index fcb1386bb0d4..186bbd79d6ce 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -166,7 +166,7 @@ void cpu_startup_entry(enum cpuhp_state state);
 void cpu_idle_poll_ctrl(bool enable);
 
 /* Attach to any functions which should be considered cpuidle. */
-#define __cpuidle	__attribute__((__section__(".cpuidle.text")))
+#define __cpuidle	__section(.cpuidle.text)
 
 bool cpu_in_idle(unsigned long pc);
 
diff --git a/include/linux/export.h b/include/linux/export.h
index fd8711ed9ac4..808c1a0c2ef9 100644
--- a/include/linux/export.h
+++ b/include/linux/export.h
@@ -104,7 +104,7 @@ struct kernel_symbol {
  * discarded in the final link stage.
  */
 #define __ksym_marker(sym)	\
-	static int __ksym_marker_##sym[0] __section(".discard.ksym") __used
+	static int __ksym_marker_##sym[0] __section(.discard.ksym) __used
 
 #define __EXPORT_SYMBOL(sym, sec)				\
 	__ksym_marker(sym);					\
diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index 6049baa5b8bc..50139505da34 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -51,12 +51,12 @@ extern struct cred init_cred;
 
 /* Attach to the init_task data structure for proper alignment */
 #ifdef CONFIG_ARCH_TASK_STRUCT_ON_STACK
-#define __init_task_data __attribute__((__section__(".data..init_task")))
+#define __init_task_data __section(.data..init_task)
 #else
 #define __init_task_data /**/
 #endif
 
 /* Attach to the thread_info data structure for proper alignment */
-#define __init_thread_info __attribute__((__section__(".data..init_thread_info")))
+#define __init_thread_info __section(.data..init_thread_info)
 
 #endif
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 5b8328a99b2a..29debfe4dd0f 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -741,8 +741,7 @@ extern int arch_early_irq_init(void);
 /*
  * We want to know which function is an entrypoint of a hardirq or a softirq.
  */
-#define __irq_entry		 __attribute__((__section__(".irqentry.text")))
-#define __softirq_entry  \
-	__attribute__((__section__(".softirqentry.text")))
+#define __irq_entry	__section(.irqentry.text)
+#define __softirq_entry	__section(.softirqentry.text)
 
 #endif
diff --git a/include/linux/sched/debug.h b/include/linux/sched/debug.h
index 95fb9e025247..e17b66221fdd 100644
--- a/include/linux/sched/debug.h
+++ b/include/linux/sched/debug.h
@@ -42,7 +42,7 @@ extern void proc_sched_set_task(struct task_struct *p);
 #endif
 
 /* Attach to any functions which should be ignored in wchan output. */
-#define __sched		__attribute__((__section__(".sched.text")))
+#define __sched		__section(.sched.text)
 
 /* Linker adds these: start and end of __sched functions */
 extern char __sched_text_start[], __sched_text_end[];
diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 9cfcc8a756ae..9de652f4e1bd 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -124,7 +124,7 @@ struct srcu_struct {
 # define __DEFINE_SRCU(name, is_static)					\
 	is_static struct srcu_struct name;				\
 	struct srcu_struct * const __srcu_struct_##name			\
-		__section("___srcu_struct_ptrs") = &name
+		__section(___srcu_struct_ptrs) = &name
 #else
 # define __DEFINE_SRCU(name, is_static)					\
 	static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data);	\
-- 
2.23.0.rc1.153.gdeed80330f-goog



* Re: WARNING in aa_sock_msg_perm
From: David Howells @ 2019-08-12 22:00 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: dhowells, syzbot, linux-kernel, netdev, syzkaller-bugs, linux-afs
In-Reply-To: <7e84e076-7096-028f-b49d-29160aea0831@I-love.SAKURA.ne.jp>

Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote:

> This is not AppArmor's bug. LSM modules expect that "struct socket" is not
> NULL.  For some reason, peer->local->socket became NULL. Thus, suspecting
> rxrpc's bug.
> 
> >  rxrpc_send_keepalive+0x1ff/0x940 net/rxrpc/output.c:656

I agree.  There's a further refcounting bug in the local object handling, but
it's proving annoyingly difficult to reliably reproduce.

David


* Re: [PATCH 09/16] sparc: prefer __section from compiler_attributes.h
From: David Miller @ 2019-08-12 22:13 UTC (permalink / raw)
  To: ndesaulniers
  Cc: akpm, sedat.dilek, jpoimboe, yhs, miguel.ojeda.sandonis,
	clang-built-linux, ast, daniel, kafai, songliubraving, sparclinux,
	linux-kernel, netdev, bpf
In-Reply-To: <20190812215052.71840-9-ndesaulniers@google.com>

From: Nick Desaulniers <ndesaulniers@google.com>
Date: Mon, 12 Aug 2019 14:50:42 -0700

> Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
> Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>

Acked-by: David S. Miller <davem@davemloft.net>


* Re: [net-next 01/15] ice: Implement ethtool ops for channels
From: Jakub Kicinski @ 2019-08-12 22:24 UTC (permalink / raw)
  To: Nguyen, Anthony L
  Cc: Kirsher, Jeffrey T, nhorman@redhat.com, davem@davemloft.net,
	netdev@vger.kernel.org, Bowers, AndrewX, sassmann@redhat.com,
	Tieman, Henry W
In-Reply-To: <8a72e5d0ee26743dc5a896a426a55e6e9660f4d2.camel@intel.com>

On Mon, 12 Aug 2019 15:07:09 +0000, Nguyen, Anthony L wrote:
> On Fri, 2019-08-09 at 14:15 -0700, Jakub Kicinski wrote:
> > On Fri,  9 Aug 2019 11:31:25 -0700, Jeff Kirsher wrote:  
> > > From: Henry Tieman <henry.w.tieman@intel.com>
> > > 
> > > Add code to query and set the number of queues on the primary
> > > VSI for a PF. This is accessed from the 'ethtool -l' and 'ethtool
> > > -L'
> > > commands, respectively.
> > > 
> > > Signed-off-by: Henry Tieman <henry.w.tieman@intel.com>
> > > Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
> > > Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
> > > Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>  
> > 
> > If you're using the same IRQ vector for RX and TX queue the channel
> > counts as combined. Looks like you are counting RX and TX separately
> > here. That's incorrect.  
> 
> Hi Jakub,
> 
> The ice driver can support asymmetric queues.  We report these
> separately, as opposed to combined, so that the user can specify a
> different number of Rx and Tx queues.

If you have 20 IRQ vectors, 10 TX queues and 20 RX queues, and the first 10
RX queues share an IRQ vector with the TX queues, the ethtool API counts them
as 10 combined and 10 rx-only.

10 tx-only and 20 rx-only would require 30 IRQ vectors.


* Re: BUG: corrupted list in rxrpc_local_processor
From: syzbot @ 2019-08-12 22:32 UTC (permalink / raw)
  To: arvid.brodin, davem, dhowells, dirk.vandermerwe, edumazet,
	jakub.kicinski, jiri, john.hurley, linux-afs, linux-kernel,
	netdev, oss-drivers, syzkaller-bugs
In-Reply-To: <000000000000492086058fad2979@google.com>

syzbot has bisected this bug to:

commit 427545b3046326cd7b4dbbd7869f08737df2ad2b
Author: Jakub Kicinski <jakub.kicinski@netronome.com>
Date:   Tue Jul 9 02:53:12 2019 +0000

     nfp: tls: count TSO segments separately for the TLS offload

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=11d04eee600000
start commit:   125b7e09 net: tc35815: Explicitly check NET_IP_ALIGN is no..
git tree:       net
final crash:    https://syzkaller.appspot.com/x/report.txt?x=13d04eee600000
console output: https://syzkaller.appspot.com/x/log.txt?x=15d04eee600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=a4c9e9f08e9e8960
dashboard link: https://syzkaller.appspot.com/bug?extid=193e29e9387ea5837f1d
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=159d4eba600000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16ba194a600000

Reported-by: syzbot+193e29e9387ea5837f1d@syzkaller.appspotmail.com
Fixes: 427545b30463 ("nfp: tls: count TSO segments separately for the TLS  
offload")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection


* Re: KASAN: use-after-free Read in rxrpc_queue_local
From: David Howells @ 2019-08-12 22:38 UTC (permalink / raw)
  To: syzbot; +Cc: dhowells, davem, linux-afs, linux-kernel, netdev, syzkaller-bugs
In-Reply-To: <0000000000007593f4058fea60d8@google.com>

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git 03a62469fffcbd535d85e42ef25ba098262e9d72


* Re: BUG: corrupted list in rxrpc_local_processor
From: Jakub Kicinski @ 2019-08-12 22:40 UTC (permalink / raw)
  To: syzbot
  Cc: arvid.brodin, davem, dhowells, dirk.vandermerwe, edumazet, jiri,
	john.hurley, linux-afs, linux-kernel, netdev, oss-drivers,
	syzkaller-bugs
In-Reply-To: <000000000000ac9048058ff3176e@google.com>

On Mon, 12 Aug 2019 15:32:00 -0700, syzbot wrote:
> syzbot has bisected this bug to:
> 
> commit 427545b3046326cd7b4dbbd7869f08737df2ad2b
> Author: Jakub Kicinski <jakub.kicinski@netronome.com>
> Date:   Tue Jul 9 02:53:12 2019 +0000
> 
>      nfp: tls: count TSO segments separately for the TLS offload
> 
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=11d04eee600000
> start commit:   125b7e09 net: tc35815: Explicitly check NET_IP_ALIGN is no..
> git tree:       net
> final crash:    https://syzkaller.appspot.com/x/report.txt?x=13d04eee600000
> console output: https://syzkaller.appspot.com/x/log.txt?x=15d04eee600000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=a4c9e9f08e9e8960
> dashboard link: https://syzkaller.appspot.com/bug?extid=193e29e9387ea5837f1d
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=159d4eba600000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16ba194a600000
> 
> Reported-by: syzbot+193e29e9387ea5837f1d@syzkaller.appspotmail.com
> Fixes: 427545b30463 ("nfp: tls: count TSO segments separately for the TLS  
> offload")
> 
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

Is there a way perhaps to tell syzbot to discard clearly bogus
bisection results?


* Re: BUG: corrupted list in rxrpc_local_processor
From: David Howells @ 2019-08-12 22:41 UTC (permalink / raw)
  To: syzbot; +Cc: dhowells, davem, linux-afs, linux-kernel, netdev, syzkaller-bugs
In-Reply-To: <000000000000492086058fad2979@google.com>

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git 03a62469fffcbd535d85e42ef25ba098262e9d72


* [PATCH net] ipv6: Fix return value of ipv6_mc_may_pull() for malformed packets
From: Stefano Brivio @ 2019-08-12 22:46 UTC (permalink / raw)
  To: David Miller
  Cc: Guillaume Nault, Hangbin Liu, Eric Dumazet, Linus Lüssing,
	netdev

Commit ba5ea614622d ("bridge: simplify ip_mc_check_igmp() and
ipv6_mc_check_mld() calls") replaces direct calls to pskb_may_pull()
in br_ipv6_multicast_mld2_report() with calls to ipv6_mc_may_pull(),
which returns -EINVAL on buffers too short to be valid IPv6 packets,
while maintaining the previous handling of the return code.

This leads to the direct opposite of the intended effect: if the
packet is malformed, -EINVAL evaluates as true, and we'll happily
proceed with the processing.

Return 0 if the packet is too short, in the same way as this was
fixed for IPv4 by commit 083b78a9ed64 ("ip: fix ip_mc_may_pull()
return value").

I don't have a reproducer for this, unlike the one referred to by
the IPv4 commit, but this is clearly broken.

Fixes: ba5ea614622d ("bridge: simplify ip_mc_check_igmp() and ipv6_mc_check_mld() calls")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
---
 include/net/addrconf.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/net/addrconf.h b/include/net/addrconf.h
index becdad576859..3f62b347b04a 100644
--- a/include/net/addrconf.h
+++ b/include/net/addrconf.h
@@ -206,7 +206,7 @@ static inline int ipv6_mc_may_pull(struct sk_buff *skb,
 				   unsigned int len)
 {
 	if (skb_transport_offset(skb) + ipv6_transport_len(skb) < len)
-		return -EINVAL;
+		return 0;
 
 	return pskb_may_pull(skb, len);
 }
-- 
2.20.1



* Re: [PATCH net] ipv4/route: do not check saddr dev if iif is LOOPBACK_IFINDEX
From: Stefano Brivio @ 2019-08-12 22:58 UTC (permalink / raw)
  To: David Miller; +Cc: dsahern, liuhangbin, netdev, mleitner
In-Reply-To: <20190811.204918.777837587917672157.davem@davemloft.net>

On Sun, 11 Aug 2019 20:49:18 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> From: David Ahern <dsahern@gmail.com>
> Date: Thu, 1 Aug 2019 22:16:00 -0600
> 
> > On 8/1/19 10:13 PM, Hangbin Liu wrote:  
> >> On Thu, Aug 01, 2019 at 01:51:25PM -0600, David Ahern wrote:  
> >>> On 8/1/19 2:29 AM, Hangbin Liu wrote:  
> >>>> Jianlin reported a bug that for IPv4, ip route get from src_addr would fail
> >>>> if src_addr is not an address on local system.
> >>>>
> >>>> # ip route get 1.1.1.1 from 2.2.2.2
> >>>> RTNETLINK answers: Invalid argument  
> >>>
> >>> so this is a forwarding lookup in which case iif should be set. Based on  
> >> 
> >> without setting iif in userspace, the kernel sets iif to lo by default.
> > 
> > right, it presumes locally generated traffic.  
> >>   
> >>> the above 'route get' inet_rtm_getroute is doing a lookup as if it is
> >>> locally generated traffic.  
> >> 
> >> yeah... but what about the IPv6 part? That causes different behavior in
> >> userspace.
> > 
> > just one of many, many annoying differences between v4 and v6. We could
> > try to catalog it.  
> 
> I think we just have to accept this difference because this change would
> change behavior for all route lookups, not just those done by ip route get.

How so, actually? I don't see how that would happen. On the forwarding
path, 'iif' is set (not to loopback interface), so that's not affected.

Is there any other route lookup possibility I'm missing?

-- 
Stefano


* Re: tun: mark small packets as owned by the tap sock
From: Dave Jones @ 2019-08-12 22:19 UTC (permalink / raw)
  To: Alexis Bauvin; +Cc: netdev
In-Reply-To: <git-mailbomb-linux-master-4b663366246be1d1d4b1b8b01245b2e88ad9e706@kernel.org>

On Wed, Aug 07, 2019 at 12:30:07AM +0000, Linux Kernel wrote:
 > Commit:     4b663366246be1d1d4b1b8b01245b2e88ad9e706
 > Parent:     16b2084a8afa1432d14ba72b7c97d7908e178178
 > Web:        https://git.kernel.org/torvalds/c/4b663366246be1d1d4b1b8b01245b2e88ad9e706
 > Author:     Alexis Bauvin <abauvin@scaleway.com>
 > AuthorDate: Tue Jul 23 16:23:01 2019 +0200
 > 
 >     tun: mark small packets as owned by the tap sock
 >     
 >     - v1 -> v2: Move skb_set_owner_w to __tun_build_skb to reduce patch size
 >     
 >     Small packets going out of a tap device go through an optimized code
 >     path that uses build_skb() rather than sock_alloc_send_pskb(). The
 >     latter calls skb_set_owner_w(), but the small packet code path does not.
 >     
 >     The net effect is that small packets are not owned by the userland
 >     application's socket (e.g. QEMU), while large packets are.
 >     This can be seen with a TCP session, where packets are not owned when
 >     the window size is small enough (around PAGE_SIZE), while they are once
 >     the window grows (note that this requires the host to support virtio
 >     tso for the guest to offload segmentation).
 >     All this leads to inconsistent behaviour in the kernel, especially on
 >     netfilter modules that use sk->socket (e.g. xt_owner).
 >     
 >     Fixes: 66ccbc9c87c2 ("tap: use build_skb() for small packet")
 >     Signed-off-by: Alexis Bauvin <abauvin@scaleway.com>
 >     Acked-by: Jason Wang <jasowang@redhat.com>

This commit breaks IPv6 routing when deployed on a Linode.
It seems to work briefly after boot, and then all packets get silently
dropped. (Presumably, it's dropping RA or ND packets.)

With this reverted, everything works as it did in rc3.

	Dave



* Re: [PATCH net] ipv6: Fix return value of ipv6_mc_may_pull() for malformed packets
From: Guillaume Nault @ 2019-08-12 23:08 UTC (permalink / raw)
  To: Stefano Brivio
  Cc: David Miller, Hangbin Liu, Eric Dumazet, Linus Lüssing,
	netdev
In-Reply-To: <dc0d0b1bc3c67e2a1346b0dd1f68428eb956fbb7.1565649789.git.sbrivio@redhat.com>

On Tue, Aug 13, 2019 at 12:46:01AM +0200, Stefano Brivio wrote:
> Commit ba5ea614622d ("bridge: simplify ip_mc_check_igmp() and
> ipv6_mc_check_mld() calls") replaces direct calls to pskb_may_pull()
> in br_ipv6_multicast_mld2_report() with calls to ipv6_mc_may_pull(),
> that returns -EINVAL on buffers too short to be valid IPv6 packets,
> while maintaining the previous handling of the return code.
> 
> This leads to the direct opposite of the intended effect: if the
> packet is malformed, -EINVAL evaluates as true, and we'll happily
> proceed with the processing.
> 
> Return 0 if the packet is too short, in the same way as this was
> fixed for IPv4 by commit 083b78a9ed64 ("ip: fix ip_mc_may_pull()
> return value").
> 
> I don't have a reproducer for this, unlike the one referred to by
> the IPv4 commit, but this is clearly broken.
> 
> Fixes: ba5ea614622d ("bridge: simplify ip_mc_check_igmp() and ipv6_mc_check_mld() calls")
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> ---
>  include/net/addrconf.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/net/addrconf.h b/include/net/addrconf.h
> index becdad576859..3f62b347b04a 100644
> --- a/include/net/addrconf.h
> +++ b/include/net/addrconf.h
> @@ -206,7 +206,7 @@ static inline int ipv6_mc_may_pull(struct sk_buff *skb,
>  				   unsigned int len)
>  {
>  	if (skb_transport_offset(skb) + ipv6_transport_len(skb) < len)
> -		return -EINVAL;
> +		return 0;
>  
>  	return pskb_may_pull(skb, len);
>  }

Acked-by: Guillaume Nault <gnault@redhat.com>


* Re: [PATCH v1] dt-bindings: fec: explicitly mark deprecated properties
From: Rob Herring @ 2019-08-12 23:12 UTC (permalink / raw)
  To: Sven Van Asbroeck
  Cc: Fugang Duan, Mark Rutland, David S . Miller, netdev, devicetree,
	linux-kernel, Andrew Lunn, Fabio Estevam, Lucas Stach
In-Reply-To: <20190718201453.13062-1-TheSven73@gmail.com>

On Thu, 18 Jul 2019 16:14:53 -0400, Sven Van Asbroeck wrote:
> fec's gpio phy reset properties have been deprecated.
> Update the dt-bindings documentation to explicitly mark
> them as such, and provide a short description of the
> recommended alternative.
> 
> Signed-off-by: Sven Van Asbroeck <TheSven73@gmail.com>
> ---
>  .../devicetree/bindings/net/fsl-fec.txt       | 30 +++++++++++--------
>  1 file changed, 17 insertions(+), 13 deletions(-)
> 

Applied, thanks.

Rob

^ permalink raw reply

* [bpf-next] selftests/bpf: fix race in flow dissector tests
From: Petar Penkov @ 2019-08-12 23:30 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, sdf, Petar Penkov

From: Petar Penkov <ppenkov@google.com>

Since the "last_dissection" map holds only the flow keys for the most
recent packet, there is a small race in the skb-less flow dissector
tests if a new packet arrives between transmitting the test packet and
reading its keys from the map. If this happens, the test packet keys
will be overwritten and the test will fail.

Changing the "last_dissection" map to a hash map keyed on the
source/dest port pair resolves this issue. Additionally, clear the
last test results from the map between tests to prevent previous test
cases from interfering with the following test cases.

Fixes: 0905beec9f52 ("selftests/bpf: run flow dissector tests in skb-less mode")
Signed-off-by: Petar Penkov <ppenkov@google.com>
---
 .../selftests/bpf/prog_tests/flow_dissector.c | 22 ++++++++++++++++++-
 tools/testing/selftests/bpf/progs/bpf_flow.c  | 13 +++++------
 2 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
index 700d73d2f22a..6892b88ae065 100644
--- a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
+++ b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
@@ -109,6 +109,8 @@ struct test tests[] = {
 			.iph.protocol = IPPROTO_TCP,
 			.iph.tot_len = __bpf_constant_htons(MAGIC_BYTES),
 			.tcp.doff = 5,
+			.tcp.source = 80,
+			.tcp.dest = 8080,
 		},
 		.keys = {
 			.nhoff = ETH_HLEN,
@@ -116,6 +118,8 @@ struct test tests[] = {
 			.addr_proto = ETH_P_IP,
 			.ip_proto = IPPROTO_TCP,
 			.n_proto = __bpf_constant_htons(ETH_P_IP),
+			.sport = 80,
+			.dport = 8080,
 		},
 	},
 	{
@@ -125,6 +129,8 @@ struct test tests[] = {
 			.iph.nexthdr = IPPROTO_TCP,
 			.iph.payload_len = __bpf_constant_htons(MAGIC_BYTES),
 			.tcp.doff = 5,
+			.tcp.source = 80,
+			.tcp.dest = 8080,
 		},
 		.keys = {
 			.nhoff = ETH_HLEN,
@@ -132,6 +138,8 @@ struct test tests[] = {
 			.addr_proto = ETH_P_IPV6,
 			.ip_proto = IPPROTO_TCP,
 			.n_proto = __bpf_constant_htons(ETH_P_IPV6),
+			.sport = 80,
+			.dport = 8080,
 		},
 	},
 	{
@@ -143,6 +151,8 @@ struct test tests[] = {
 			.iph.protocol = IPPROTO_TCP,
 			.iph.tot_len = __bpf_constant_htons(MAGIC_BYTES),
 			.tcp.doff = 5,
+			.tcp.source = 80,
+			.tcp.dest = 8080,
 		},
 		.keys = {
 			.nhoff = ETH_HLEN + VLAN_HLEN,
@@ -150,6 +160,8 @@ struct test tests[] = {
 			.addr_proto = ETH_P_IP,
 			.ip_proto = IPPROTO_TCP,
 			.n_proto = __bpf_constant_htons(ETH_P_IP),
+			.sport = 80,
+			.dport = 8080,
 		},
 	},
 	{
@@ -161,6 +173,8 @@ struct test tests[] = {
 			.iph.nexthdr = IPPROTO_TCP,
 			.iph.payload_len = __bpf_constant_htons(MAGIC_BYTES),
 			.tcp.doff = 5,
+			.tcp.source = 80,
+			.tcp.dest = 8080,
 		},
 		.keys = {
 			.nhoff = ETH_HLEN + VLAN_HLEN * 2,
@@ -169,6 +183,8 @@ struct test tests[] = {
 			.addr_proto = ETH_P_IPV6,
 			.ip_proto = IPPROTO_TCP,
 			.n_proto = __bpf_constant_htons(ETH_P_IPV6),
+			.sport = 80,
+			.dport = 8080,
 		},
 	},
 	{
@@ -487,7 +503,8 @@ void test_flow_dissector(void)
 			BPF_FLOW_DISSECTOR_F_PARSE_1ST_FRAG;
 		struct bpf_prog_test_run_attr tattr = {};
 		struct bpf_flow_keys flow_keys = {};
-		__u32 key = 0;
+		__u32 key = (__u32)(tests[i].keys.sport) << 16 |
+			    tests[i].keys.dport;
 
 		/* For skb-less case we can't pass input flags; run
 		 * only the tests that have a matching set of flags.
@@ -504,6 +521,9 @@ void test_flow_dissector(void)
 
 		CHECK_ATTR(err, tests[i].name, "skb-less err %d\n", err);
 		CHECK_FLOW_KEYS(tests[i].name, flow_keys, tests[i].keys);
+
+		err = bpf_map_delete_elem(keys_fd, &key);
+		CHECK_ATTR(err, tests[i].name, "bpf_map_delete_elem %d\n", err);
 	}
 
 	bpf_prog_detach(prog_fd, BPF_FLOW_DISSECTOR);
diff --git a/tools/testing/selftests/bpf/progs/bpf_flow.c b/tools/testing/selftests/bpf/progs/bpf_flow.c
index 08bd8b9d58d0..040a44206f29 100644
--- a/tools/testing/selftests/bpf/progs/bpf_flow.c
+++ b/tools/testing/selftests/bpf/progs/bpf_flow.c
@@ -65,8 +65,8 @@ struct {
 } jmp_table SEC(".maps");
 
 struct {
-	__uint(type, BPF_MAP_TYPE_ARRAY);
-	__uint(max_entries, 1);
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__uint(max_entries, 1024);
 	__type(key, __u32);
 	__type(value, struct bpf_flow_keys);
 } last_dissection SEC(".maps");
@@ -74,12 +74,11 @@ struct {
 static __always_inline int export_flow_keys(struct bpf_flow_keys *keys,
 					    int ret)
 {
-	struct bpf_flow_keys *val;
-	__u32 key = 0;
+	__u32 key = (__u32)(keys->sport) << 16 | keys->dport;
+	struct bpf_flow_keys val;
 
-	val = bpf_map_lookup_elem(&last_dissection, &key);
-	if (val)
-		memcpy(val, keys, sizeof(*val));
+	memcpy(&val, keys, sizeof(val));
+	bpf_map_update_elem(&last_dissection, &key, &val, BPF_ANY);
 	return ret;
 }
 
-- 
2.23.0.rc1.153.gdeed80330f-goog


^ permalink raw reply related

* Re: [PATCH 1/3] macb: bindings doc: update sifive fu540-c000 binding
From: Rob Herring @ 2019-08-12 23:32 UTC (permalink / raw)
  To: Yash Shah
  Cc: davem, robh+dt, paul.walmsley, netdev, devicetree, linux-kernel,
	linux-riscv, mark.rutland, palmer, aou, nicolas.ferre, ynezz,
	sachin.ghadi, Yash Shah
In-Reply-To: <1563534631-15897-1-git-send-email-yash.shah@sifive.com>

On Fri, 19 Jul 2019 16:40:29 +0530, Yash Shah wrote:
> As per the discussion with Nicolas Ferre, rename the compatible property
> to a more appropriate and specific string.
> LINK: https://lkml.org/lkml/2019/7/17/200
> 
> Signed-off-by: Yash Shah <yash.shah@sifive.com>
> ---
>  Documentation/devicetree/bindings/net/macb.txt | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 

Reviewed-by: Rob Herring <robh@kernel.org>

^ permalink raw reply

* Re: [PATCH 3/3] riscv: dts: Add DT node for SiFive FU540 Ethernet controller driver
From: Rob Herring @ 2019-08-12 23:33 UTC (permalink / raw)
  To: Paul Walmsley
  Cc: Yash Shah, davem, sagar.kadam, netdev, devicetree, linux-kernel,
	linux-riscv, mark.rutland, palmer, aou, nicolas.ferre, ynezz,
	sachin.ghadi, andrew
In-Reply-To: <alpine.DEB.2.21.9999.1907221446340.5793@viisi.sifive.com>

On Mon, Jul 22, 2019 at 02:48:40PM -0700, Paul Walmsley wrote:
> On Fri, 19 Jul 2019, Yash Shah wrote:
> 
> > DT node for SiFive FU540-C000 GEMGXL Ethernet controller driver added
> > 
> > Signed-off-by: Yash Shah <yash.shah@sifive.com>
> 
> Thanks, queuing this one for v5.3-rc with Andrew's suggested change to 
> change phy1 to phy0.
> 
> Am assuming patches 1 and 2 will go in via -net.

I don't think that has happened.

Rob

^ permalink raw reply

* Re: [PATCH net] ipv4/route: do not check saddr dev if iif is LOOPBACK_IFINDEX
From: David Ahern @ 2019-08-13  0:23 UTC (permalink / raw)
  To: Stefano Brivio, David Miller; +Cc: liuhangbin, netdev, mleitner
In-Reply-To: <20190813005830.41f92428@redhat.com>

On 8/12/19 4:58 PM, Stefano Brivio wrote:
> How so, actually? I don't see how that would happen. On the forwarding
> path, 'iif' is set (not to loopback interface), so that's not affected.
> 
> Is there any other route lookup possibility I'm missing?

The use case is saddr set with FLOWI_FLAG_ANYSRC not set, which seems
pretty common to me. From a quick look: icmp_route_lookup,
ipv4_update_pmtu, ipv4_redirect, inet_csk_route_req, ...

Enable trace_fib_table_lookup and look at the flags for various use cases.

^ permalink raw reply

* Re: [patch net-next v3 0/3] net: devlink: Finish network namespace support
From: David Ahern @ 2019-08-13  0:24 UTC (permalink / raw)
  To: Jiri Pirko, netdev; +Cc: davem, jakub.kicinski, stephen, mlxsw
In-Reply-To: <20190812134751.30838-1-jiri@resnulli.us>

On 8/12/19 7:47 AM, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
> 
> Devlink has accounted for network namespaces from the beginning, but
> the instances have been fixed to init_net. The first patch allows a
> user to move existing devlink instances into namespaces:
> 
> $ devlink dev
> netdevsim/netdevsim1
> $ ip netns add ns1
> $ devlink dev set netdevsim/netdevsim1 netns ns1
> $ devlink -N ns1 dev
> netdevsim/netdevsim1
> 
> The last patch allows a user to create a new netdevsim instance
> directly inside the network namespace of the caller.

The namespace behavior seems odd to me. If a devlink instance is created
in a namespace and never moved, it should die with the namespace. With
this patch set, the devlink instance and its ports are moved to init_net
on namespace delete.

The fib controller needs an update to return the namespace of the
devlink instance (on top of the patch applied to net):

diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
index 89795071f085..fa7e876f2d3b 100644
--- a/drivers/net/netdevsim/dev.c
+++ b/drivers/net/netdevsim/dev.c
@@ -114,11 +114,6 @@ static void nsim_dev_port_debugfs_exit(struct nsim_dev_port *nsim_dev_port)
        debugfs_remove_recursive(nsim_dev_port->ddir);
 }

-static struct net *nsim_devlink_net(struct devlink *devlink)
-{
-       return &init_net;
-}
-
 static u64 nsim_dev_ipv4_fib_resource_occ_get(void *priv)
 {
        struct net *net = priv;
@@ -154,7 +149,7 @@ static int nsim_dev_resources_register(struct devlink *devlink)
                .size_granularity = 1,
                .unit = DEVLINK_RESOURCE_UNIT_ENTRY
        };
-       struct net *net = nsim_devlink_net(devlink);
+       struct net *net = devlink_net(devlink);
        int err;
        u64 n;

@@ -309,7 +304,7 @@ static int nsim_dev_reload(struct devlink *devlink,
                NSIM_RESOURCE_IPV4_FIB, NSIM_RESOURCE_IPV4_FIB_RULES,
                NSIM_RESOURCE_IPV6_FIB, NSIM_RESOURCE_IPV6_FIB_RULES
        };
-       struct net *net = nsim_devlink_net(devlink);
+       struct net *net = devlink_net(devlink);
        int i;

        for (i = 0; i < ARRAY_SIZE(res_ids); ++i) {


^ permalink raw reply related

* Re: [patch net-next rfc 3/7] net: rtnetlink: add commands to add and delete alternative ifnames
From: David Ahern @ 2019-08-13  0:29 UTC (permalink / raw)
  To: Jakub Kicinski, Roopa Prabhu
  Cc: Jiri Pirko, netdev, David Miller, Stephen Hemminger, dcbw,
	Michal Kubecek, Andrew Lunn, parav, Saeed Mahameed, mlxsw
In-Reply-To: <20190812144310.442869de@cakuba.netronome.com>

On 8/12/19 3:43 PM, Jakub Kicinski wrote:
> Is not adding commands better because it's easier to deal with the
> RTM_NEWLINK notification? I must say it's unclear from the thread why
> muxing the op through RTM_SETLINK is preferable. IMHO new op is
> cleaner, do we have precedent for such IFLA_.*_OP-style attributes?

An alternative name for a link is not a primary object; it is only an
attribute of a link and links are manipulated through RTM_*LINK commands.

^ permalink raw reply

* Re: KASAN: use-after-free Read in rxrpc_queue_local
From: syzbot @ 2019-08-13  0:36 UTC (permalink / raw)
  To: davem, dhowells, linux-afs, linux-kernel, netdev, syzkaller-bugs
In-Reply-To: <4694.1565649521@warthog.procyon.org.uk>

Hello,

syzbot has tested the proposed patch but the reproducer still triggered crash:
KASAN: use-after-free Read in rxrpc_queue_local

==================================================================
BUG: KASAN: use-after-free in atomic_read include/asm-generic/atomic-instrumented.h:26 [inline]
BUG: KASAN: use-after-free in rxrpc_queue_local+0x7c/0x3e0 net/rxrpc/local_object.c:354
Read of size 4 at addr ffff888081e3db14 by task syz-executor.5/31180

CPU: 0 PID: 31180 Comm: syz-executor.5 Not tainted 5.3.0-rc3+ #1
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x172/0x1f0 lib/dump_stack.c:113
  print_address_description.cold+0xd4/0x306 mm/kasan/report.c:351
  __kasan_report.cold+0x1b/0x36 mm/kasan/report.c:482
  kasan_report+0x12/0x17 mm/kasan/common.c:612
  check_memory_region_inline mm/kasan/generic.c:185 [inline]
  check_memory_region+0x134/0x1a0 mm/kasan/generic.c:192
  __kasan_check_read+0x11/0x20 mm/kasan/common.c:92
  atomic_read include/asm-generic/atomic-instrumented.h:26 [inline]
  rxrpc_queue_local+0x7c/0x3e0 net/rxrpc/local_object.c:354
  rxrpc_unuse_local+0x52/0x80 net/rxrpc/local_object.c:408
  rxrpc_release_sock net/rxrpc/af_rxrpc.c:904 [inline]
  rxrpc_release+0x47d/0x840 net/rxrpc/af_rxrpc.c:930
  __sock_release+0xce/0x280 net/socket.c:590
  sock_close+0x1e/0x30 net/socket.c:1268
  __fput+0x2ff/0x890 fs/file_table.c:280
  ____fput+0x16/0x20 fs/file_table.c:313
  task_work_run+0x145/0x1c0 kernel/task_work.c:113
  tracehook_notify_resume include/linux/tracehook.h:188 [inline]
  exit_to_usermode_loop+0x316/0x380 arch/x86/entry/common.c:163
  prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
  syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
  do_syscall_64+0x5a9/0x6a0 arch/x86/entry/common.c:299
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x413511
Code: 75 14 b8 03 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 04 1b 00 00 c3 48 83 ec 08 e8 0a fc ff ff 48 89 04 24 b8 03 00 00 00 0f 05 <48> 8b 3c 24 48 89 c2 e8 53 fc ff ff 48 89 d0 48 83 c4 08 48 3d 01
RSP: 002b:00007fffc45736d0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000413511
RDX: 0000001b33920000 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 0000000000000001 R08: ffffffffffffffff R09: ffffffffffffffff
R10: 00007fffc45737b0 R11: 0000000000000293 R12: 000000000075bf20
R13: 000000000003624a R14: 0000000000760270 R15: ffffffffffffffff

Allocated by task 31182:
  save_stack+0x23/0x90 mm/kasan/common.c:69
  set_track mm/kasan/common.c:77 [inline]
  __kasan_kmalloc mm/kasan/common.c:487 [inline]
  __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:460
  kasan_kmalloc+0x9/0x10 mm/kasan/common.c:501
  kmem_cache_alloc_trace+0x158/0x790 mm/slab.c:3550
  kmalloc include/linux/slab.h:552 [inline]
  kzalloc include/linux/slab.h:748 [inline]
  rxrpc_alloc_local net/rxrpc/local_object.c:79 [inline]
  rxrpc_lookup_local+0x562/0x1ba0 net/rxrpc/local_object.c:277
  rxrpc_bind+0x34d/0x5e0 net/rxrpc/af_rxrpc.c:149
  __sys_bind+0x239/0x290 net/socket.c:1647
  __do_sys_bind net/socket.c:1658 [inline]
  __se_sys_bind net/socket.c:1656 [inline]
  __x64_sys_bind+0x73/0xb0 net/socket.c:1656
  do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 9:
  save_stack+0x23/0x90 mm/kasan/common.c:69
  set_track mm/kasan/common.c:77 [inline]
  __kasan_slab_free+0x102/0x150 mm/kasan/common.c:449
  kasan_slab_free+0xe/0x10 mm/kasan/common.c:457
  __cache_free mm/slab.c:3425 [inline]
  kfree+0x10a/0x2c0 mm/slab.c:3756
  rxrpc_local_rcu+0x62/0x80 net/rxrpc/local_object.c:495
  __rcu_reclaim kernel/rcu/rcu.h:222 [inline]
  rcu_do_batch kernel/rcu/tree.c:2114 [inline]
  rcu_core+0x67f/0x1580 kernel/rcu/tree.c:2314
  rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2323
  __do_softirq+0x262/0x98c kernel/softirq.c:292

The buggy address belongs to the object at ffff888081e3db00
  which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 20 bytes inside of
  1024-byte region [ffff888081e3db00, ffff888081e3df00)
The buggy address belongs to the page:
page:ffffea0002078f00 refcount:1 mapcount:0 mapping:ffff8880aa400c40 index:0x0 compound_mapcount: 0
flags: 0x1fffc0000010200(slab|head)
raw: 01fffc0000010200 ffffea0002073288 ffffea00022f1608 ffff8880aa400c40
raw: 0000000000000000 ffff888081e3c000 0000000100000007 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
  ffff888081e3da00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
  ffff888081e3da80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ffff888081e3db00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                          ^
  ffff888081e3db80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff888081e3dc00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================


Tested on:

commit:         03a62469 rxrpc: Fix local endpoint replacement
git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git
console output: https://syzkaller.appspot.com/x/log.txt?x=14500d36600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=a4c9e9f08e9e8960
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)


^ permalink raw reply

* [PATCH] tools: bpftool: add feature check for zlib
From: Peter Wu @ 2019-08-13  0:38 UTC (permalink / raw)
  To: Daniel Borkmann, Jakub Kicinski
  Cc: Stanislav Fomichev, Alexei Starovoitov, netdev, Quentin Monnet

bpftool requires libelf, and zlib for decompressing /proc/config.gz.
zlib is a transitive dependency via libelf, and became mandatory since
elfutils 0.165 (Jan 2016). The feature check of libelf is already done
in the elfdep target of tools/lib/bpf/Makefile, pulled in by bpftool via
a dependency on libbpf.a. Add a similar feature check for zlib.

Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Peter Wu <peter@lekensteyn.nl>
---
Hi,

This is a follow-up to an earlier "tools: bpftool: fix reading from
/proc/config.gz" patch. It applies Jakub's and Daniel's suggestions from:
https://lkml.kernel.org/r/6154af6c-4f24-4b0a-25c2-a8a1d6c9948f@iogearbox.net
https://lkml.kernel.org/r/20190809140956.24369b00@cakuba.netronome.com

Feel free to massage the commit message and patch as you see fit.

Kind regards,
Peter
---
 tools/bpf/bpftool/Makefile | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile
index 078bd0dcfba5..4c9d1ffc3fc7 100644
--- a/tools/bpf/bpftool/Makefile
+++ b/tools/bpf/bpftool/Makefile
@@ -58,8 +58,8 @@ INSTALL ?= install
 RM ?= rm -f
 
 FEATURE_USER = .bpftool
-FEATURE_TESTS = libbfd disassembler-four-args reallocarray
-FEATURE_DISPLAY = libbfd disassembler-four-args
+FEATURE_TESTS = libbfd disassembler-four-args reallocarray zlib
+FEATURE_DISPLAY = libbfd disassembler-four-args zlib
 
 check_feat := 1
 NON_CHECK_FEAT_TARGETS := clean uninstall doc doc-clean doc-install doc-uninstall
@@ -111,6 +111,8 @@ OBJS = $(patsubst %.c,$(OUTPUT)%.o,$(SRCS)) $(OUTPUT)disasm.o
 $(OUTPUT)disasm.o: $(srctree)/kernel/bpf/disasm.c
 	$(QUIET_CC)$(COMPILE.c) -MMD -o $@ $<
 
+$(OUTPUT)feature.o: | zdep
+
 $(OUTPUT)bpftool: $(OBJS) $(LIBBPF)
 	$(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) -o $@ $^ $(LIBS)
 
@@ -149,6 +151,9 @@ doc-uninstall:
 
 FORCE:
 
-.PHONY: all FORCE clean install uninstall
+zdep:
+	@if [ "$(feature-zlib)" != "1" ]; then echo "No zlib found"; exit 1 ; fi
+
+.PHONY: all FORCE clean install uninstall zdep
 .PHONY: doc doc-clean doc-install doc-uninstall
 .DEFAULT_GOAL := all
-- 
2.22.0


^ permalink raw reply related

