Netdev List

* Re: [PATCH 2/2] kconfig: remove silentoldconfig target
From: Masahiro Yamada @ 2018-10-30 16:00 UTC
  To: Linux Kbuild mailing list
  Cc: David S. Miller, open list:DOCUMENTATION, Networking,
	Jonathan Corbet, Jeff Kirsher, intel-wired-lan,
	Linux Kernel Mailing List
In-Reply-To: <1540827688-15999-2-git-send-email-yamada.masahiro@socionext.com>

On Tue, Oct 30, 2018 at 12:43 AM Masahiro Yamada
<yamada.masahiro@socionext.com> wrote:
>
> As commit 911a91c39cab ("kconfig: rename silentoldconfig to
> syncconfig") announced, it is time for the removal.
>
> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
> ---

Applied to linux-kbuild.



>  Documentation/networking/ice.rst | 2 +-
>  scripts/kconfig/Makefile         | 9 +--------
>  2 files changed, 2 insertions(+), 9 deletions(-)
>
> diff --git a/Documentation/networking/ice.rst b/Documentation/networking/ice.rst
> index 1e4948c..4d118b8 100644
> --- a/Documentation/networking/ice.rst
> +++ b/Documentation/networking/ice.rst
> @@ -20,7 +20,7 @@ Enabling the driver
>  The driver is enabled via the standard kernel configuration system,
>  using the make command::
>
> -  make oldconfig/silentoldconfig/menuconfig/etc.
> +  make oldconfig/menuconfig/etc.
>
>  The driver is located in the menu structure at:
>
> diff --git a/scripts/kconfig/Makefile b/scripts/kconfig/Makefile
> index 5d37a60..63b6092 100644
> --- a/scripts/kconfig/Makefile
> +++ b/scripts/kconfig/Makefile
> @@ -68,14 +68,7 @@ PHONY += $(simple-targets)
>  $(simple-targets): $(obj)/conf
>         $< $(silent) --$@ $(Kconfig)
>
> -PHONY += silentoldconfig savedefconfig defconfig
> -
> -# We do not expect manual invokcation of "silentoldcofig" (or "syncconfig").
> -silentoldconfig: syncconfig
> -       @echo "  WARNING: \"silentoldconfig\" has been renamed to \"syncconfig\""
> -       @echo "            and is now an internal implementation detail."
> -       @echo "            What you want is probably \"oldconfig\"."
> -       @echo "            \"silentoldconfig\" will be removed after Linux 4.19"
> +PHONY += savedefconfig defconfig
>
>  savedefconfig: $(obj)/conf
>         $< $(silent) --$@=defconfig $(Kconfig)
> --
> 2.7.4
>


-- 
Best Regards
Masahiro Yamada


* [PATCH net] net/mlx4_en: add a missing <net/ip.h> include
From: Eric Dumazet @ 2018-10-30  7:18 UTC
  To: David S . Miller
  Cc: netdev, Eric Dumazet, Tariq Toukan, Abdul Haleem, Eric Dumazet

Abdul Haleem reported a build error on ppc:

drivers/net/ethernet/mellanox/mlx4/en_rx.c:582:18: warning: `struct
iphdr` declared inside parameter list [enabled by default]
           struct iphdr *iph)
                  ^
drivers/net/ethernet/mellanox/mlx4/en_rx.c:582:18: warning: its scope is
only this definition or declaration, which is probably not what you want
[enabled by default]
drivers/net/ethernet/mellanox/mlx4/en_rx.c: In function
get_fixed_ipv4_csum:
drivers/net/ethernet/mellanox/mlx4/en_rx.c:586:20: error: dereferencing
pointer to incomplete type
  __u8 ipproto = iph->protocol;
                    ^

Fixes: 55469bc6b577 ("drivers: net: remove <net/busy_poll.h> inclusion when not needed")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_rx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 5a6d0919533d6e0e619927abd753c5d07ed95dac..db00bf1c23f5ad31d64652ddc8bee32e2e7534c8 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -43,6 +43,7 @@
 #include <linux/vmalloc.h>
 #include <linux/irq.h>
 
+#include <net/ip.h>
 #if IS_ENABLED(CONFIG_IPV6)
 #include <net/ip6_checksum.h>
 #endif
-- 
2.19.1.568.g152ad8e336-goog
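
The compiler output quoted in the report is the usual symptom of a struct that is named in a parameter list before any declaration of it is visible: the type is then scoped to that prototype and stays incomplete, so the later member access fails. A minimal user-space sketch of the fixed situation follows. `struct iphdr_sketch` and `get_ipproto()` are hypothetical stand-ins for illustration, not the kernel's `struct iphdr` or the driver's function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's struct iphdr, reduced to two
 * fields. In the driver, the real definition becomes visible through the
 * added <net/ip.h> include.
 */
struct iphdr_sketch {
	uint8_t ttl;
	uint8_t protocol;
};

/*
 * Because the full definition is visible above, iph->protocol compiles.
 * If the struct were first mentioned in this parameter list instead, the
 * compiler would warn that it is "declared inside parameter list" and
 * then reject the dereference as a pointer to an incomplete type.
 */
uint8_t get_ipproto(const struct iphdr_sketch *iph)
{
	return iph->protocol;
}

/* Small usage check: protocol 6 (TCP) is read back unchanged. */
void ipproto_selfcheck(void)
{
	struct iphdr_sketch h = { .ttl = 64, .protocol = 6 };

	assert(get_ipproto(&h) == 6);
}
```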


* KMSAN: uninit-value in tipc_nl_compat_bearer_enable
From: syzbot @ 2018-10-30 16:18 UTC
  To: davem, jon.maloy, linux-kernel, netdev, syzkaller-bugs,
	tipc-discussion, ying.xue

Hello,

syzbot found the following crash on:

HEAD commit:    4bb25354f0b0 kmsan: unpoison pt_regs in do_nmi()
git tree:       https://github.com/google/kmsan.git/master
console output: https://syzkaller.appspot.com/x/log.txt?x=107556e5400000
kernel config:  https://syzkaller.appspot.com/x/.config?x=36c582b1a617b1e6
dashboard link: https://syzkaller.appspot.com/bug?extid=b33d5cae0efd35dbfe77
compiler:       clang version 8.0.0 (trunk 339414)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=17c2e95b400000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=129a919d400000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+b33d5cae0efd35dbfe77@syzkaller.appspotmail.com

==================================================================
BUG: KMSAN: uninit-value in strlen+0x3b/0xa0 lib/string.c:484
CPU: 1 PID: 6371 Comm: syz-executor652 Not tainted 4.19.0-rc8+ #70
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x306/0x460 lib/dump_stack.c:113
  kmsan_report+0x1a2/0x2e0 mm/kmsan/kmsan.c:917
  __msan_warning+0x7c/0xe0 mm/kmsan/kmsan_instr.c:500
  strlen+0x3b/0xa0 lib/string.c:484
  nla_put_string include/net/netlink.h:1011 [inline]
  tipc_nl_compat_bearer_enable+0x238/0x7b0 net/tipc/netlink_compat.c:389
  __tipc_nl_compat_doit net/tipc/netlink_compat.c:311 [inline]
  tipc_nl_compat_doit+0x39f/0xae0 net/tipc/netlink_compat.c:344
  tipc_nl_compat_recv+0x147c/0x2760 net/tipc/netlink_compat.c:1107
  genl_family_rcv_msg net/netlink/genetlink.c:601 [inline]
  genl_rcv_msg+0x185c/0x1a20 net/netlink/genetlink.c:626
  netlink_rcv_skb+0x394/0x640 net/netlink/af_netlink.c:2454
  genl_rcv+0x63/0x80 net/netlink/genetlink.c:637
  netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
  netlink_unicast+0x166d/0x1720 net/netlink/af_netlink.c:1343
  netlink_sendmsg+0x1391/0x1420 net/netlink/af_netlink.c:1908
  sock_sendmsg_nosec net/socket.c:621 [inline]
  sock_sendmsg net/socket.c:631 [inline]
  ___sys_sendmsg+0xe47/0x1200 net/socket.c:2116
  __sys_sendmsg net/socket.c:2154 [inline]
  __do_sys_sendmsg net/socket.c:2163 [inline]
  __se_sys_sendmsg+0x307/0x460 net/socket.c:2161
  __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
  do_syscall_64+0xbe/0x100 arch/x86/entry/common.c:291
  entry_SYSCALL_64_after_hwframe+0x63/0xe7
RIP: 0033:0x440179
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7  
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff  
ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fffef7beee8 EFLAGS: 00000213 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440179
RDX: 0000000000000000 RSI: 0000000020000100 RDI: 0000000000000003
RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8
R10: 0000000000000000 R11: 0000000000000213 R12: 0000000000401a00
R13: 0000000000401a90 R14: 0000000000000000 R15: 0000000000000000

Uninit was created at:
  kmsan_save_stack_with_flags mm/kmsan/kmsan.c:255 [inline]
  kmsan_internal_poison_shadow+0xc8/0x1d0 mm/kmsan/kmsan.c:180
  kmsan_kmalloc+0xa4/0x120 mm/kmsan/kmsan_hooks.c:104
  kmsan_slab_alloc+0x10/0x20 mm/kmsan/kmsan_hooks.c:113
  slab_post_alloc_hook mm/slab.h:446 [inline]
  slab_alloc_node mm/slub.c:2727 [inline]
  __kmalloc_node_track_caller+0xb43/0x1400 mm/slub.c:4360
  __kmalloc_reserve net/core/skbuff.c:138 [inline]
  __alloc_skb+0x422/0xe90 net/core/skbuff.c:206
  alloc_skb include/linux/skbuff.h:996 [inline]
  netlink_alloc_large_skb net/netlink/af_netlink.c:1189 [inline]
  netlink_sendmsg+0xcaf/0x1420 net/netlink/af_netlink.c:1883
  sock_sendmsg_nosec net/socket.c:621 [inline]
  sock_sendmsg net/socket.c:631 [inline]
  ___sys_sendmsg+0xe47/0x1200 net/socket.c:2116
  __sys_sendmsg net/socket.c:2154 [inline]
  __do_sys_sendmsg net/socket.c:2163 [inline]
  __se_sys_sendmsg+0x307/0x460 net/socket.c:2161
  __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
  do_syscall_64+0xbe/0x100 arch/x86/entry/common.c:291
  entry_SYSCALL_64_after_hwframe+0x63/0xe7
==================================================================


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with  
syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches


* KMSAN: uninit-value in tipc_nl_compat_name_table_dump
From: syzbot @ 2018-10-30 16:18 UTC
  To: davem, jon.maloy, linux-kernel, netdev, syzkaller-bugs,
	tipc-discussion, ying.xue

Hello,

syzbot found the following crash on:

HEAD commit:    4bb25354f0b0 kmsan: unpoison pt_regs in do_nmi()
git tree:       https://github.com/google/kmsan.git/master
console output: https://syzkaller.appspot.com/x/log.txt?x=1708369d400000
kernel config:  https://syzkaller.appspot.com/x/.config?x=36c582b1a617b1e6
dashboard link: https://syzkaller.appspot.com/bug?extid=06e771a754829716a327
compiler:       clang version 8.0.0 (trunk 339414)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=11e4c523400000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14ec1bf9400000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+06e771a754829716a327@syzkaller.appspotmail.com

==================================================================
BUG: KMSAN: uninit-value in __arch_swab32  
arch/x86/include/uapi/asm/swab.h:10 [inline]
BUG: KMSAN: uninit-value in __fswab32 include/uapi/linux/swab.h:59 [inline]
BUG: KMSAN: uninit-value in tipc_nl_compat_name_table_dump+0x4a8/0xba0  
net/tipc/netlink_compat.c:826
CPU: 0 PID: 6290 Comm: syz-executor848 Not tainted 4.19.0-rc8+ #70
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x306/0x460 lib/dump_stack.c:113
  kmsan_report+0x1a2/0x2e0 mm/kmsan/kmsan.c:917
  __msan_warning+0x7c/0xe0 mm/kmsan/kmsan_instr.c:500
  __arch_swab32 arch/x86/include/uapi/asm/swab.h:10 [inline]
  __fswab32 include/uapi/linux/swab.h:59 [inline]
  tipc_nl_compat_name_table_dump+0x4a8/0xba0 net/tipc/netlink_compat.c:826
  __tipc_nl_compat_dumpit+0x59e/0xdb0 net/tipc/netlink_compat.c:205
  tipc_nl_compat_dumpit+0x63a/0x820 net/tipc/netlink_compat.c:270
  tipc_nl_compat_handle net/tipc/netlink_compat.c:1151 [inline]
  tipc_nl_compat_recv+0x1402/0x2760 net/tipc/netlink_compat.c:1210
  genl_family_rcv_msg net/netlink/genetlink.c:601 [inline]
  genl_rcv_msg+0x185c/0x1a20 net/netlink/genetlink.c:626
  netlink_rcv_skb+0x394/0x640 net/netlink/af_netlink.c:2454
  genl_rcv+0x63/0x80 net/netlink/genetlink.c:637
  netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
  netlink_unicast+0x166d/0x1720 net/netlink/af_netlink.c:1343
  netlink_sendmsg+0x1391/0x1420 net/netlink/af_netlink.c:1908
  sock_sendmsg_nosec net/socket.c:621 [inline]
  sock_sendmsg net/socket.c:631 [inline]
  ___sys_sendmsg+0xe47/0x1200 net/socket.c:2116
  __sys_sendmsg net/socket.c:2154 [inline]
  __do_sys_sendmsg net/socket.c:2163 [inline]
  __se_sys_sendmsg+0x307/0x460 net/socket.c:2161
  __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
  do_syscall_64+0xbe/0x100 arch/x86/entry/common.c:291
  entry_SYSCALL_64_after_hwframe+0x63/0xe7
RIP: 0033:0x440179
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7  
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff  
ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffecec49318 EFLAGS: 00000213 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440179
RDX: 0000000000000000 RSI: 0000000020000100 RDI: 0000000000000003
RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8
R10: 0000000000000000 R11: 0000000000000213 R12: 0000000000401a00
R13: 0000000000401a90 R14: 0000000000000000 R15: 0000000000000000

Uninit was created at:
  kmsan_save_stack_with_flags mm/kmsan/kmsan.c:255 [inline]
  kmsan_internal_poison_shadow+0xc8/0x1d0 mm/kmsan/kmsan.c:180
  kmsan_kmalloc+0xa4/0x120 mm/kmsan/kmsan_hooks.c:104
  kmsan_slab_alloc+0x10/0x20 mm/kmsan/kmsan_hooks.c:113
  slab_post_alloc_hook mm/slab.h:446 [inline]
  slab_alloc_node mm/slub.c:2727 [inline]
  __kmalloc_node_track_caller+0xb43/0x1400 mm/slub.c:4360
  __kmalloc_reserve net/core/skbuff.c:138 [inline]
  __alloc_skb+0x422/0xe90 net/core/skbuff.c:206
  alloc_skb include/linux/skbuff.h:996 [inline]
  netlink_alloc_large_skb net/netlink/af_netlink.c:1189 [inline]
  netlink_sendmsg+0xcaf/0x1420 net/netlink/af_netlink.c:1883
  sock_sendmsg_nosec net/socket.c:621 [inline]
  sock_sendmsg net/socket.c:631 [inline]
  ___sys_sendmsg+0xe47/0x1200 net/socket.c:2116
  __sys_sendmsg net/socket.c:2154 [inline]
  __do_sys_sendmsg net/socket.c:2163 [inline]
  __se_sys_sendmsg+0x307/0x460 net/socket.c:2161
  __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
  do_syscall_64+0xbe/0x100 arch/x86/entry/common.c:291
  entry_SYSCALL_64_after_hwframe+0x63/0xe7
==================================================================


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with  
syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches


* Re: Latest net-next kernel 4.19.0+
From: Eric Dumazet @ 2018-10-30  7:29 UTC
  To: Dimitris Michailidis
  Cc: Cong Wang, Paweł Staszewski, Linux Kernel Network Developers
In-Reply-To: <CAG76SjY7fnFdgamBELATyO8NGnyNFYiX33SgLE6-q=eoBM8jKg@mail.gmail.com>

On 10/29/2018 11:09 PM, Dimitris Michailidis wrote:

> 
> Indeed this is a bug. I would expect it to produce frequent errors
> though as many odd-length
> packets would trigger it. Do you have RXFCS? Regardless, how
> frequently do you see the problem?
> 

Old kernels (before 88078d98d1bb) were simply resetting ip_summed to CHECKSUM_NONE.

And before your fix (commit d55bef5059dd057bd), the mlx5 bug was canceling out the bug you fixed.

So we now need to also fix mlx5.

And of course use skb_header_pointer() in mlx5e_get_fcs() as I mentioned earlier,
plus __get_unaligned_cpu32() as you hinted.
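
To illustrate the two helpers mentioned above, here is a rough user-space sketch of the pattern: return a pointer straight into the linear data when the requested bytes are contiguous, otherwise copy them into a caller-supplied buffer, and then load the 32-bit value without assuming alignment. Everything named here (`struct pkt_sketch`, `pkt_header_pointer()`, `load_u32_unaligned()`, `pkt_get_fcs()`) is hypothetical and only mimics the idea; it is not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packet: a linear head followed by a single fragment. */
struct pkt_sketch {
	const uint8_t *head;
	size_t head_len;
	const uint8_t *frag;
	size_t frag_len;
};

/*
 * Return a pointer to len bytes starting at off. If the range lies fully
 * in the linear head, point into it directly; otherwise gather the bytes
 * into buf and return buf. Returns NULL if the range is out of bounds.
 */
const void *pkt_header_pointer(const struct pkt_sketch *p, size_t off,
			       size_t len, void *buf)
{
	size_t total = p->head_len + p->frag_len;
	uint8_t *out = buf;
	size_t i;

	if (off > total || len > total - off)
		return NULL;
	if (off + len <= p->head_len)
		return p->head + off;
	for (i = 0; i < len; i++) {
		size_t pos = off + i;

		out[i] = pos < p->head_len ? p->head[pos]
					   : p->frag[pos - p->head_len];
	}
	return buf;
}

/* Alignment-safe 32-bit load in CPU byte order. */
uint32_t load_u32_unaligned(const void *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Fetch the last four bytes of the packet; 0 if the packet is too short. */
uint32_t pkt_get_fcs(const struct pkt_sketch *p)
{
	size_t total = p->head_len + p->frag_len;
	uint32_t tmp;
	const void *fcs;

	if (total < 4)
		return 0;
	fcs = pkt_header_pointer(p, total - 4, 4, &tmp);
	return fcs ? load_u32_unaligned(fcs) : 0;
}
```

The point of the pattern is that the caller no longer needs to know whether the trailing four bytes sit in the head, in one fragment, or straddle a boundary.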


* [PATCH net] net/mlx5e: fix csum adjustments caused by RXFCS
From: Eric Dumazet @ 2018-10-30  7:57 UTC
  To: David S . Miller
  Cc: netdev, Eric Dumazet, Eric Dumazet, Eran Ben Elisha,
	Saeed Mahameed, Dimitris Michailidis, Cong Wang,
	Paweł Staszewski

As shown by Dimitris, we need to use csum_block_add() instead of csum_add()
when adding the FCS contribution to skb csum.

Before 4.18 (more exactly commit 88078d98d1bb "net: pskb_trim_rcsum()
and CHECKSUM_COMPLETE are friends"), the whole skb csum was thrown away,
so RXFCS changes were ignored.

Then, before commit d55bef5059dd ("net: fix pskb_trim_rcsum_slow() with
odd trim offset"), the mlx5 and pskb_trim_rcsum_slow() bugs were canceling
each other out.

Now that pskb_trim_rcsum_slow() is fixed, we need to fix mlx5.

Note that this patch also rewrites mlx5e_get_fcs() to:

- Use skb_header_pointer() instead of reinventing it.
- Use __get_unaligned_cpu32() to avoid possible unaligned accesses,
  as Dimitris pointed out.

Fixes: 902a545904c7 ("net/mlx5e: When RXFCS is set, add FCS data into checksum calculation")
Reported-by: Paweł Staszewski <pstaszewski@itcare.pl>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Eran Ben Elisha <eranbe@mellanox.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Dimitris Michailidis <dmichail@google.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Paweł Staszewski <pstaszewski@itcare.pl>
---
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   | 45 ++++---------------
 1 file changed, 9 insertions(+), 36 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 94224c22ecc310a87b6715051e335446f29bec03..79638dcbae78395fb723c9bf3fa877e7a42d91cd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -713,43 +713,15 @@ static inline void mlx5e_enable_ecn(struct mlx5e_rq *rq, struct sk_buff *skb)
 	rq->stats->ecn_mark += !!rc;
 }
 
-static __be32 mlx5e_get_fcs(struct sk_buff *skb)
+static u32 mlx5e_get_fcs(const struct sk_buff *skb)
 {
-	int last_frag_sz, bytes_in_prev, nr_frags;
-	u8 *fcs_p1, *fcs_p2;
-	skb_frag_t *last_frag;
-	__be32 fcs_bytes;
+	const void *fcs_bytes;
+	u32 _fcs_bytes;
 
-	if (!skb_is_nonlinear(skb))
-		return *(__be32 *)(skb->data + skb->len - ETH_FCS_LEN);
+	fcs_bytes = skb_header_pointer(skb, skb->len - ETH_FCS_LEN,
+				       ETH_FCS_LEN, &_fcs_bytes);
 
-	nr_frags = skb_shinfo(skb)->nr_frags;
-	last_frag = &skb_shinfo(skb)->frags[nr_frags - 1];
-	last_frag_sz = skb_frag_size(last_frag);
-
-	/* If all FCS data is in last frag */
-	if (last_frag_sz >= ETH_FCS_LEN)
-		return *(__be32 *)(skb_frag_address(last_frag) +
-				   last_frag_sz - ETH_FCS_LEN);
-
-	fcs_p2 = (u8 *)skb_frag_address(last_frag);
-	bytes_in_prev = ETH_FCS_LEN - last_frag_sz;
-
-	/* Find where the other part of the FCS is - Linear or another frag */
-	if (nr_frags == 1) {
-		fcs_p1 = skb_tail_pointer(skb);
-	} else {
-		skb_frag_t *prev_frag = &skb_shinfo(skb)->frags[nr_frags - 2];
-
-		fcs_p1 = skb_frag_address(prev_frag) +
-			    skb_frag_size(prev_frag);
-	}
-	fcs_p1 -= bytes_in_prev;
-
-	memcpy(&fcs_bytes, fcs_p1, bytes_in_prev);
-	memcpy(((u8 *)&fcs_bytes) + bytes_in_prev, fcs_p2, last_frag_sz);
-
-	return fcs_bytes;
+	return __get_unaligned_cpu32(fcs_bytes);
 }
 
 static u8 get_ip_proto(struct sk_buff *skb, __be16 proto)
@@ -797,8 +769,9 @@ static inline void mlx5e_handle_csum(struct net_device *netdev,
 						 network_depth - ETH_HLEN,
 						 skb->csum);
 		if (unlikely(netdev->features & NETIF_F_RXFCS))
-			skb->csum = csum_add(skb->csum,
-					     (__force __wsum)mlx5e_get_fcs(skb));
+			skb->csum = csum_block_add(skb->csum,
+						   (__force __wsum)mlx5e_get_fcs(skb),
+						   skb->len - ETH_FCS_LEN);
 		stats->csum_complete++;
 		return;
 	}
-- 
2.19.1.568.g152ad8e336-goog
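
As a rough illustration of why the offset matters when the FCS contribution is added, the following user-space sketch works with 16-bit ones-complement sums. When a block starts at an odd byte offset, its bytes land in the opposite halves of each 16-bit word, so its partial sum has to be byte-swapped before being added. The helper names (`fold16`, `csum16`, `csum_add16`, `csum_block_add16`) are hypothetical and use a simplified 16-bit representation rather than the kernel's `__wsum`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit ones-complement sum. */
uint16_t fold16(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * Ones-complement sum over big-endian 16-bit words; a trailing odd byte
 * is padded with zero. Meant for short buffers only (no intermediate
 * folding of the accumulator).
 */
uint16_t csum16(const uint8_t *p, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];
	if (len & 1)
		sum += (uint32_t)p[len - 1] << 8;
	return fold16(sum);
}

/* Plain addition: only correct if the second block starts at an even offset. */
uint16_t csum_add16(uint16_t a, uint16_t b)
{
	return fold16((uint32_t)a + b);
}

/* Offset-aware addition: swap the bytes of b when it starts at an odd offset. */
uint16_t csum_block_add16(uint16_t a, uint16_t b, size_t offset)
{
	if (offset & 1)
		b = (uint16_t)((b << 8) | (b >> 8));
	return csum_add16(a, b);
}

/* Split a 7-byte buffer at odd offset 3 and recombine both ways. */
void csum_selfcheck(void)
{
	const uint8_t data[] = { 1, 2, 3, 4, 5, 6, 7 };
	uint16_t whole = csum16(data, 7);
	uint16_t head = csum16(data, 3);
	uint16_t tail = csum16(data + 3, 4);

	assert(csum_block_add16(head, tail, 3) == whole);
	assert(csum_add16(head, tail) != whole);
}
```

In the patch, the offset passed is skb->len - ETH_FCS_LEN, so packets whose payload length before the FCS is odd are the ones where plain addition would give the wrong result.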


* Re: Latest net-next kernel 4.19.0+
From: Paweł Staszewski @ 2018-10-30  8:09 UTC
  To: Eric Dumazet, Dimitris Michailidis
  Cc: Cong Wang, Linux Kernel Network Developers
In-Reply-To: <68f25a28-b79e-d3ae-6eef-50c354ad63ae@gmail.com>



On 30.10.2018 at 08:29, Eric Dumazet wrote:
>
> On 10/29/2018 11:09 PM, Dimitris Michailidis wrote:
>
>> Indeed this is a bug. I would expect it to produce frequent errors
>> though as many odd-length
>> packets would trigger it. Do you have RXFCS? Regardless, how
>> frequently do you see the problem?
>>
> Old kernels (before 88078d98d1bb) were simply resetting ip_summed to CHECKSUM_NONE.
>
> And before your fix (commit d55bef5059dd057bd), the mlx5 bug was canceling out the bug you fixed.
>
> So we now need to also fix mlx5.
>
> And of course use skb_header_pointer() in mlx5e_get_fcs() as I mentioned earlier,
> plus __get_unaligned_cpu32() as you hinted.
>
>
>
>

No RXFCS.

And this trace appears really frequently, like once per 3/4 seconds,
as shown below:
[28965.776864] vlan1490: hw csum failure
[28965.776867] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0+ #1
[28965.776868] Call Trace:
[28965.776870]  <IRQ>
[28965.776876]  dump_stack+0x46/0x5b
[28965.776879]  __skb_checksum_complete+0x9a/0xa0
[28965.776882]  tcp_v4_rcv+0xef/0x960
[28965.776884]  ip_local_deliver_finish+0x49/0xd0
[28965.776886]  ip_local_deliver+0x5e/0xe0
[28965.776888]  ? ip_sublist_rcv_finish+0x50/0x50
[28965.776889]  ip_rcv+0x41/0xc0
[28965.776891]  __netif_receive_skb_one_core+0x4b/0x70
[28965.776893]  netif_receive_skb_internal+0x2f/0xd0
[28965.776894]  napi_gro_receive+0xb7/0xe0
[28965.776897]  mlx5e_handle_rx_cqe+0x7a/0xd0
[28965.776899]  mlx5e_poll_rx_cq+0xc6/0x930
[28965.776900]  mlx5e_napi_poll+0xab/0xc90
[28965.776904]  ? kmem_cache_free_bulk+0x1e4/0x280
[28965.776905]  net_rx_action+0x1f1/0x320
[28965.776909]  __do_softirq+0xec/0x2b7
[28965.776912]  irq_exit+0x7b/0x80
[28965.776913]  do_IRQ+0x45/0xc0
[28965.776915]  common_interrupt+0xf/0xf
[28965.776916]  </IRQ>
[28965.776918] RIP: 0010:mwait_idle+0x5f/0x1b0
[28965.776919] Code: a8 01 0f 85 3f 01 00 00 31 d2 65 48 8b 04 25 80 4c 
01 00 48 89 d1 0f 01 c8 48 8b 00 a8 08 0f 85 40 01 00 00 31 c0 fb 0f 01 
c9 <65> 8b 2d 2a c9 6a 7e 0f 1f 44 00 00 65 48 8b 04 25 80 4c 01 00 f0
[28965.776920] RSP: 0018:ffffffff82203e98 EFLAGS: 00000246 ORIG_RAX: 
ffffffffffffffd3
[28965.776921] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 
0000000000000000
[28965.776922] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 
0000000000000000
[28965.776922] RBP: 0000000000000000 R08: 00000000000000aa R09: 
ffff88046f81fbc0
[28965.776923] R10: 0000000000000000 R11: 00000001006d5985 R12: 
ffffffff8220f780
[28965.776924] R13: ffffffff8220f780 R14: 0000000000000000 R15: 
0000000000000000
[28965.776927]  do_idle+0x1a3/0x1c0
[28965.776929]  cpu_startup_entry+0x14/0x20
[28965.776932]  start_kernel+0x488/0x4a8
[28965.776935]  secondary_startup_64+0xa4/0xb0
[28965.981529] vlan1490: hw csum failure
[28965.981531] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0+ #1
[28965.981532] Call Trace:
[28965.981534]  <IRQ>
[28965.981539]  dump_stack+0x46/0x5b
[28965.981543]  __skb_checksum_complete+0x9a/0xa0
[28965.981545]  tcp_v4_rcv+0xef/0x960
[28965.981548]  ip_local_deliver_finish+0x49/0xd0
[28965.981550]  ip_local_deliver+0x5e/0xe0
[28965.981551]  ? ip_sublist_rcv_finish+0x50/0x50
[28965.981552]  ip_rcv+0x41/0xc0
[28965.981555]  __netif_receive_skb_one_core+0x4b/0x70
[28965.981556]  netif_receive_skb_internal+0x2f/0xd0
[28965.981558]  napi_gro_receive+0xb7/0xe0
[28965.981560]  mlx5e_handle_rx_cqe+0x7a/0xd0
[28965.981562]  mlx5e_poll_rx_cq+0xc6/0x930
[28965.981563]  mlx5e_napi_poll+0xab/0xc90
[28965.981567]  ? kmem_cache_free_bulk+0x1e4/0x280
[28965.981568]  net_rx_action+0x1f1/0x320
[28965.981571]  __do_softirq+0xec/0x2b7
[28965.981575]  irq_exit+0x7b/0x80
[28965.981576]  do_IRQ+0x45/0xc0
[28965.981578]  common_interrupt+0xf/0xf
[28965.981579]  </IRQ>
[28965.981580] RIP: 0010:mwait_idle+0x5f/0x1b0
[28965.981582] Code: a8 01 0f 85 3f 01 00 00 31 d2 65 48 8b 04 25 80 4c 
01 00 48 89 d1 0f 01 c8 48 8b 00 a8 08 0f 85 40 01 00 00 31 c0 fb 0f 01 
c9 <65> 8b 2d 2a c9 6a 7e 0f 1f 44 00 00 65 48 8b 04 25 80 4c 01 00 f0
[28965.981583] RSP: 0018:ffffffff82203e98 EFLAGS: 00000246 ORIG_RAX: 
ffffffffffffffd3
[28965.981584] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 
0000000000000000
[28965.981585] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 
0000000000000000
[28965.981586] RBP: 0000000000000000 R08: 0000000000000383 R09: 
ffff88046f81fbc0
[28965.981586] R10: 0000000000000000 R11: 00000001006d59b8 R12: 
ffffffff8220f780
[28965.981587] R13: ffffffff8220f780 R14: 0000000000000000 R15: 
0000000000000000
[28965.981591]  do_idle+0x1a3/0x1c0
[28965.981592]  cpu_startup_entry+0x14/0x20
[28965.981596]  start_kernel+0x488/0x4a8
[28965.981600]  secondary_startup_64+0xa4/0xb0
[28966.511782] vlan1490: hw csum failure
[28966.511785] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0+ #1
[28966.511785] Call Trace:
[28966.511787]  <IRQ>
[28966.511793]  dump_stack+0x46/0x5b
[28966.511797]  __skb_checksum_complete+0x9a/0xa0
[28966.511799]  tcp_v4_rcv+0xef/0x960
[28966.511802]  ip_local_deliver_finish+0x49/0xd0
[28966.511804]  ip_local_deliver+0x5e/0xe0
[28966.511806]  ? ip_sublist_rcv_finish+0x50/0x50
[28966.511807]  ip_rcv+0x41/0xc0
[28966.511810]  __netif_receive_skb_one_core+0x4b/0x70
[28966.511812]  netif_receive_skb_internal+0x2f/0xd0
[28966.511814]  napi_gro_receive+0xb7/0xe0
[28966.511817]  mlx5e_handle_rx_cqe+0x7a/0xd0
[28966.511819]  mlx5e_poll_rx_cq+0xc6/0x930
[28966.511821]  mlx5e_napi_poll+0xab/0xc90
[28966.511824]  ? kmem_cache_free_bulk+0x1e4/0x280
[28966.511826]  net_rx_action+0x1f1/0x320
[28966.511830]  __do_softirq+0xec/0x2b7
[28966.511834]  irq_exit+0x7b/0x80
[28966.511835]  do_IRQ+0x45/0xc0
[28966.511837]  common_interrupt+0xf/0xf
[28966.511838]  </IRQ>
[28966.511839] RIP: 0010:mwait_idle+0x5f/0x1b0
[28966.511841] Code: a8 01 0f 85 3f 01 00 00 31 d2 65 48 8b 04 25 80 4c 
01 00 48 89 d1 0f 01 c8 48 8b 00 a8 08 0f 85 40 01 00 00 31 c0 fb 0f 01 
c9 <65> 8b 2d 2a c9 6a 7e 0f 1f 44 00 00 65 48 8b 04 25 80 4c 01 00 f0
[28966.511841] RSP: 0018:ffffffff82203e98 EFLAGS: 00000246 ORIG_RAX: 
ffffffffffffffd3
[28966.511842] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 
0000000000000000
[28966.511843] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 
0000000000000000
[28966.511844] RBP: 0000000000000000 R08: 000000000000011f R09: 
ffff88046f81fbc0
[28966.511844] R10: 0000000000000000 R11: 00000001006d5a3d R12: 
ffffffff8220f780
[28966.511845] R13: ffffffff8220f780 R14: 0000000000000000 R15: 
0000000000000000
[28966.511848]  do_idle+0x1a3/0x1c0
[28966.511850]  cpu_startup_entry+0x14/0x20
[28966.511853]  start_kernel+0x488/0x4a8
[28966.511857]  secondary_startup_64+0xa4/0xb0
[28967.271020] vlan1490: hw csum failure
[28967.271023] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0+ #1
[28967.271024] Call Trace:
[28967.271025]  <IRQ>
[28967.271032]  dump_stack+0x46/0x5b
[28967.271035]  __skb_checksum_complete+0x9a/0xa0
[28967.271038]  tcp_v4_rcv+0xef/0x960
[28967.271040]  ip_local_deliver_finish+0x49/0xd0
[28967.271042]  ip_local_deliver+0x5e/0xe0
[28967.271044]  ? ip_sublist_rcv_finish+0x50/0x50
[28967.271045]  ip_rcv+0x41/0xc0
[28967.271047]  __netif_receive_skb_one_core+0x4b/0x70
[28967.271049]  netif_receive_skb_internal+0x2f/0xd0
[28967.271051]  napi_gro_receive+0xb7/0xe0
[28967.271054]  mlx5e_handle_rx_cqe+0x7a/0xd0
[28967.271055]  mlx5e_poll_rx_cq+0xc6/0x930
[28967.271057]  mlx5e_napi_poll+0xab/0xc90
[28967.271060]  ? kmem_cache_free_bulk+0x1e4/0x280
[28967.271062]  net_rx_action+0x1f1/0x320
[28967.271065]  __do_softirq+0xec/0x2b7
[28967.271069]  irq_exit+0x7b/0x80
[28967.271071]  do_IRQ+0x45/0xc0
[28967.271072]  common_interrupt+0xf/0xf
[28967.271073]  </IRQ>
[28967.271075] RIP: 0010:mwait_idle+0x5f/0x1b0
[28967.271077] Code: a8 01 0f 85 3f 01 00 00 31 d2 65 48 8b 04 25 80 4c 
01 00 48 89 d1 0f 01 c8 48 8b 00 a8 08 0f 85 40 01 00 00 31 c0 fb 0f 01 
c9 <65> 8b 2d 2a c9 6a 7e 0f 1f 44 00 00 65 48 8b 04 25 80 4c 01 00 f0
[28967.271078] RSP: 0018:ffffffff82203e98 EFLAGS: 00000246 ORIG_RAX: 
ffffffffffffffd3
[28967.271079] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 
0000000000000000
[28967.271080] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 
0000000000000000
[28967.271081] RBP: 0000000000000000 R08: 00000000000002d1 R09: 
ffff88046f81fbc0
[28967.271082] R10: 0000000000000000 R11: 00000001006d5afa R12: 
ffffffff8220f780
[28967.271082] R13: ffffffff8220f780 R14: 0000000000000000 R15: 
0000000000000000
[28967.271086]  do_idle+0x1a3/0x1c0
[28967.271088]  cpu_startup_entry+0x14/0x20
[28967.271091]  start_kernel+0x488/0x4a8
[28967.271094]  secondary_startup_64+0xa4/0xb0
[28967.477135] vlan1490: hw csum failure
[28967.477138] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0+ #1
[28967.477139] Call Trace:
[28967.477141]  <IRQ>
[28967.477148]  dump_stack+0x46/0x5b
[28967.477152]  __skb_checksum_complete+0x9a/0xa0
[28967.477154]  tcp_v4_rcv+0xef/0x960
[28967.477157]  ip_local_deliver_finish+0x49/0xd0
[28967.477159]  ip_local_deliver+0x5e/0xe0
[28967.477161]  ? ip_sublist_rcv_finish+0x50/0x50
[28967.477162]  ip_rcv+0x41/0xc0
[28967.477165]  __netif_receive_skb_one_core+0x4b/0x70
[28967.477167]  netif_receive_skb_internal+0x2f/0xd0
[28967.477169]  napi_gro_receive+0xb7/0xe0
[28967.477172]  mlx5e_handle_rx_cqe+0x7a/0xd0
[28967.477174]  mlx5e_poll_rx_cq+0xc6/0x930
[28967.477175]  mlx5e_napi_poll+0xab/0xc90
[28967.477179]  ? kmem_cache_free_bulk+0x1e4/0x280
[28967.477181]  net_rx_action+0x1f1/0x320
[28967.477185]  __do_softirq+0xec/0x2b7
[28967.477190]  irq_exit+0x7b/0x80
[28967.477192]  do_IRQ+0x45/0xc0
[28967.477194]  common_interrupt+0xf/0xf
[28967.477195]  </IRQ>
[28967.477197] RIP: 0010:mwait_idle+0x5f/0x1b0
[28967.477199] Code: a8 01 0f 85 3f 01 00 00 31 d2 65 48 8b 04 25 80 4c 
01 00 48 89 d1 0f 01 c8 48 8b 00 a8 08 0f 85 40 01 00 00 31 c0 fb 0f 01 
c9 <65> 8b 2d 2a c9 6a 7e 0f 1f 44 00 00 65 48 8b 04 25 80 4c 01 00 f0
[28967.477200] RSP: 0018:ffffffff82203e98 EFLAGS: 00000246 ORIG_RAX: 
ffffffffffffffd3
[28967.477202] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 
0000000000000000
[28967.477202] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 
0000000000000000
[28967.477203] RBP: 0000000000000000 R08: 0000000000000395 R09: 
000000000000ba13
[28967.477204] R10: 0000000000000000 R11: 00000001006d5b2e R12: 
ffffffff8220f780
[28967.477204] R13: ffffffff8220f780 R14: 0000000000000000 R15: 
0000000000000000
[28967.477208]  do_idle+0x1a3/0x1c0
[28967.477209]  cpu_startup_entry+0x14/0x20
[28967.477213]  start_kernel+0x488/0x4a8
[28967.477216]  secondary_startup_64+0xa4/0xb0
[28967.682124] vlan1490: hw csum failure
[28967.682127] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0+ #1
[28967.682127] Call Trace:
[28967.682129]  <IRQ>
[28967.682135]  dump_stack+0x46/0x5b
[28967.682138]  __skb_checksum_complete+0x9a/0xa0
[28967.682141]  tcp_v4_rcv+0xef/0x960
[28967.682143]  ip_local_deliver_finish+0x49/0xd0
[28967.682145]  ip_local_deliver+0x5e/0xe0
[28967.682146]  ? ip_sublist_rcv_finish+0x50/0x50
[28967.682147]  ip_rcv+0x41/0xc0
[28967.682150]  __netif_receive_skb_one_core+0x4b/0x70
[28967.682151]  netif_receive_skb_internal+0x2f/0xd0
[28967.682153]  napi_gro_receive+0xb7/0xe0
[28967.682156]  mlx5e_handle_rx_cqe+0x7a/0xd0
[28967.682157]  mlx5e_poll_rx_cq+0xc6/0x930
[28967.682159]  mlx5e_napi_poll+0xab/0xc90
[28967.682162]  ? kmem_cache_free_bulk+0x1e4/0x280
[28967.682164]  net_rx_action+0x1f1/0x320
[28967.682167]  __do_softirq+0xec/0x2b7
[28967.682171]  irq_exit+0x7b/0x80
[28967.682172]  do_IRQ+0x45/0xc0
[28967.682173]  common_interrupt+0xf/0xf
[28967.682175]  </IRQ>
[28967.682176] RIP: 0010:mwait_idle+0x5f/0x1b0
[28967.682177] Code: a8 01 0f 85 3f 01 00 00 31 d2 65 48 8b 04 25 80 4c 
01 00 48 89 d1 0f 01 c8 48 8b 00 a8 08 0f 85 40 01 00 00 31 c0 fb 0f 01 
c9 <65> 8b 2d 2a c9 6a 7e 0f 1f 44 00 00 65 48 8b 04 25 80 4c 01 00 f0
[28967.682178] RSP: 0018:ffffffff82203e98 EFLAGS: 00000246 ORIG_RAX: 
ffffffffffffffd3
[28967.682179] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 
0000000000000000
[28967.682180] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 
0000000000000000
[28967.682180] RBP: 0000000000000000 R08: 000000000000002a R09: 
ffff88046f81fbc0
[28967.682181] R10: 0000000000000000 R11: 00000001006d5b61 R12: 
ffffffff8220f780
[28967.682181] R13: ffffffff8220f780 R14: 0000000000000000 R15: 
0000000000000000
[28967.682185]  do_idle+0x1a3/0x1c0
[28967.682186]  cpu_startup_entry+0x14/0x20
[28967.682189]  start_kernel+0x488/0x4a8
[28967.682192]  secondary_startup_64+0xa4/0xb0
[28968.112281] vlan1490: hw csum failure
[28968.112284] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0+ #1
[28968.112285] Call Trace:
[28968.112287]  <IRQ>
[28968.112294]  dump_stack+0x46/0x5b
[28968.112297]  __skb_checksum_complete+0x9a/0xa0
[28968.112300]  tcp_v4_rcv+0xef/0x960
[28968.112303]  ip_local_deliver_finish+0x49/0xd0
[28968.112305]  ip_local_deliver+0x5e/0xe0
[28968.112307]  ? ip_sublist_rcv_finish+0x50/0x50
[28968.112308]  ip_rcv+0x41/0xc0
[28968.112311]  __netif_receive_skb_one_core+0x4b/0x70
[28968.112313]  netif_receive_skb_internal+0x2f/0xd0
[28968.112315]  napi_gro_receive+0xb7/0xe0
[28968.112318]  mlx5e_handle_rx_cqe+0x7a/0xd0
[28968.112320]  mlx5e_poll_rx_cq+0xc6/0x930
[28968.112322]  mlx5e_napi_poll+0xab/0xc90
[28968.112326]  ? kmem_cache_free_bulk+0x1e4/0x280
[28968.112327]  net_rx_action+0x1f1/0x320
[28968.112331]  __do_softirq+0xec/0x2b7
[28968.112335]  irq_exit+0x7b/0x80
[28968.112336]  do_IRQ+0x45/0xc0
[28968.112338]  common_interrupt+0xf/0xf
[28968.112339]  </IRQ>
[28968.112340] RIP: 0010:mwait_idle+0x5f/0x1b0
[28968.112341] Code: a8 01 0f 85 3f 01 00 00 31 d2 65 48 8b 04 25 80 4c 01 00 48 89 d1 0f 01 c8 48 8b 00 a8 08 0f 85 40 01 00 00 31 c0 fb 0f 01 c9 <65> 8b 2d 2a c9 6a 7e 0f 1f 44 00 00 65 48 8b 04 25 80 4c 01 00 f0
[28968.112342] RSP: 0018:ffffffff82203e98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd3
[28968.112343] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[28968.112344] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[28968.112344] RBP: 0000000000000000 R08: 000000000000030f R09: ffff88046f81fbc0
[28968.112345] R10: 0000000000000000 R11: 00000001006d5bcd R12: ffffffff8220f780
[28968.112345] R13: ffffffff8220f780 R14: 0000000000000000 R15: 0000000000000000
[28968.112349]  do_idle+0x1a3/0x1c0
[28968.112350]  cpu_startup_entry+0x14/0x20
[28968.112354]  start_kernel+0x488/0x4a8
[28968.112357]  secondary_startup_64+0xa4/0xb0
[28968.316518] vlan1490: hw csum failure
[28968.316521] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0+ #1
[28968.316522] Call Trace:
[28968.316523]  <IRQ>
[28968.316529]  dump_stack+0x46/0x5b
[28968.316534]  __skb_checksum_complete+0x9a/0xa0
[28968.316536]  tcp_v4_rcv+0xef/0x960
[28968.316539]  ip_local_deliver_finish+0x49/0xd0
[28968.316541]  ip_local_deliver+0x5e/0xe0
[28968.316543]  ? ip_sublist_rcv_finish+0x50/0x50
[28968.316544]  ip_rcv+0x41/0xc0
[28968.316547]  __netif_receive_skb_one_core+0x4b/0x70
[28968.316549]  netif_receive_skb_internal+0x2f/0xd0
[28968.316550]  napi_gro_receive+0xb7/0xe0
[28968.316554]  mlx5e_handle_rx_cqe+0x7a/0xd0
[28968.316555]  mlx5e_poll_rx_cq+0xc6/0x930
[28968.316557]  mlx5e_napi_poll+0xab/0xc90
[28968.316561]  ? kmem_cache_free_bulk+0x1e4/0x280
[28968.316562]  net_rx_action+0x1f1/0x320
[28968.316566]  __do_softirq+0xec/0x2b7
[28968.316570]  irq_exit+0x7b/0x80
[28968.316571]  do_IRQ+0x45/0xc0
[28968.316573]  common_interrupt+0xf/0xf
[28968.316574]  </IRQ>
[28968.316576] RIP: 0010:mwait_idle+0x5f/0x1b0
[28968.316577] Code: a8 01 0f 85 3f 01 00 00 31 d2 65 48 8b 04 25 80 4c 01 00 48 89 d1 0f 01 c8 48 8b 00 a8 08 0f 85 40 01 00 00 31 c0 fb 0f 01 c9 <65> 8b 2d 2a c9 6a 7e 0f 1f 44 00 00 65 48 8b 04 25 80 4c 01 00 f0
[28968.316578] RSP: 0018:ffffffff82203e98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd3
[28968.316579] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[28968.316580] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[28968.316581] RBP: 0000000000000000 R08: 00000000000001bc R09: ffff88046f81fbc0
[28968.316581] R10: 0000000000000000 R11: 00000001006d5c00 R12: ffffffff8220f780
[28968.316582] R13: ffffffff8220f780 R14: 0000000000000000 R15: 0000000000000000
[28968.316585]  do_idle+0x1a3/0x1c0
[28968.316587]  cpu_startup_entry+0x14/0x20
[28968.316590]  start_kernel+0x488/0x4a8
[28968.316594]  secondary_startup_64+0xa4/0xb0
[28968.521770] vlan1490: hw csum failure
[28968.521773] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0+ #1
[28968.521774] Call Trace:
[28968.521776]  <IRQ>
[28968.521782]  dump_stack+0x46/0x5b
[28968.521786]  __skb_checksum_complete+0x9a/0xa0
[28968.521788]  tcp_v4_rcv+0xef/0x960
[28968.521791]  ip_local_deliver_finish+0x49/0xd0
[28968.521793]  ip_local_deliver+0x5e/0xe0
[28968.521795]  ? ip_sublist_rcv_finish+0x50/0x50
[28968.521796]  ip_rcv+0x41/0xc0
[28968.521799]  __netif_receive_skb_one_core+0x4b/0x70
[28968.521802]  netif_receive_skb_internal+0x2f/0xd0
[28968.521804]  napi_gro_receive+0xb7/0xe0
[28968.521807]  mlx5e_handle_rx_cqe+0x7a/0xd0
[28968.521809]  mlx5e_poll_rx_cq+0xc6/0x930
[28968.521811]  mlx5e_napi_poll+0xab/0xc90
[28968.521816]  ? kmem_cache_free_bulk+0x1e4/0x280
[28968.521818]  net_rx_action+0x1f1/0x320
[28968.521821]  __do_softirq+0xec/0x2b7
[28968.521826]  irq_exit+0x7b/0x80
[28968.521827]  do_IRQ+0x45/0xc0
[28968.521830]  common_interrupt+0xf/0xf
[28968.521831]  </IRQ>
[28968.521832] RIP: 0010:mwait_idle+0x5f/0x1b0
[28968.521835] Code: a8 01 0f 85 3f 01 00 00 31 d2 65 48 8b 04 25 80 4c 01 00 48 89 d1 0f 01 c8 48 8b 00 a8 08 0f 85 40 01 00 00 31 c0 fb 0f 01 c9 <65> 8b 2d 2a c9 6a 7e 0f 1f 44 00 00 65 48 8b 04 25 80 4c 01 00 f0
[28968.521835] RSP: 0018:ffffffff82203e98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd3
[28968.521836] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[28968.521837] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[28968.521838] RBP: 0000000000000000 R08: 0000000000000288 R09: ffff88046f81fbc0
[28968.521838] R10: 0000000000000000 R11: 00000001006d5c33 R12: ffffffff8220f780
[28968.521839] R13: ffffffff8220f780 R14: 0000000000000000 R15: 0000000000000000
[28968.521842]  do_idle+0x1a3/0x1c0
[28968.521844]  cpu_startup_entry+0x14/0x20
[28968.521847]  start_kernel+0x488/0x4a8
[28968.521850]  secondary_startup_64+0xa4/0xb0
[28968.726877] vlan1490: hw csum failure
[28968.726880] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0+ #1
[28968.726881] Call Trace:
[28968.726882]  <IRQ>
[28968.726888]  dump_stack+0x46/0x5b
[28968.726892]  __skb_checksum_complete+0x9a/0xa0
[28968.726894]  tcp_v4_rcv+0xef/0x960
[28968.726897]  ip_local_deliver_finish+0x49/0xd0
[28968.726898]  ip_local_deliver+0x5e/0xe0
[28968.726900]  ? ip_sublist_rcv_finish+0x50/0x50
[28968.726901]  ip_rcv+0x41/0xc0
[28968.726904]  __netif_receive_skb_one_core+0x4b/0x70
[28968.726905]  netif_receive_skb_internal+0x2f/0xd0
[28968.726907]  napi_gro_receive+0xb7/0xe0
[28968.726909]  mlx5e_handle_rx_cqe+0x7a/0xd0
[28968.726911]  mlx5e_poll_rx_cq+0xc6/0x930
[28968.726913]  mlx5e_napi_poll+0xab/0xc90
[28968.726916]  ? kmem_cache_free_bulk+0x1e4/0x280
[28968.726918]  net_rx_action+0x1f1/0x320
[28968.726921]  __do_softirq+0xec/0x2b7
[28968.726925]  irq_exit+0x7b/0x80
[28968.726926]  do_IRQ+0x45/0xc0
[28968.726927]  common_interrupt+0xf/0xf
[28968.726928]  </IRQ>
[28968.726930] RIP: 0010:mwait_idle+0x5f/0x1b0
[28968.726931] Code: a8 01 0f 85 3f 01 00 00 31 d2 65 48 8b 04 25 80 4c 01 00 48 89 d1 0f 01 c8 48 8b 00 a8 08 0f 85 40 01 00 00 31 c0 fb 0f 01 c9 <65> 8b 2d 2a c9 6a 7e 0f 1f 44 00 00 65 48 8b 04 25 80 4c 01 00 f0
[28968.726932] RSP: 0018:ffffffff82203e98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd3
[28968.726932] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[28968.726933] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[28968.726934] RBP: 0000000000000000 R08: 0000000000000092 R09: ffff88046f81fbc0
[28968.726934] R10: 0000000000000000 R11: 00000001006d5c66 R12: ffffffff8220f780
[28968.726935] R13: ffffffff8220f780 R14: 0000000000000000 R15: 0000000000000000
[28968.726938]  do_idle+0x1a3/0x1c0
[28968.726939]  cpu_startup_entry+0x14/0x20
[28968.726942]  start_kernel+0x488/0x4a8
[28968.726946]  secondary_startup_64+0xa4/0xb0
[28969.015326] vlan1490: hw csum failure
[28969.015329] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0+ #1
[28969.015330] Call Trace:
[28969.015331]  <IRQ>
[28969.015337]  dump_stack+0x46/0x5b
[28969.015341]  __skb_checksum_complete+0x9a/0xa0
[28969.015344]  tcp_v4_rcv+0xef/0x960
[28969.015347]  ip_local_deliver_finish+0x49/0xd0
[28969.015349]  ip_local_deliver+0x5e/0xe0
[28969.015351]  ? ip_sublist_rcv_finish+0x50/0x50
[28969.015352]  ip_rcv+0x41/0xc0
[28969.015355]  __netif_receive_skb_one_core+0x4b/0x70
[28969.015357]  netif_receive_skb_internal+0x2f/0xd0
[28969.015359]  napi_gro_receive+0xb7/0xe0
[28969.015362]  mlx5e_handle_rx_cqe+0x7a/0xd0
[28969.015364]  mlx5e_poll_rx_cq+0xc6/0x930
[28969.015365]  mlx5e_napi_poll+0xab/0xc90
[28969.015370]  ? kmem_cache_free_bulk+0x1e4/0x280
[28969.015371]  net_rx_action+0x1f1/0x320
[28969.015375]  __do_softirq+0xec/0x2b7
[28969.015379]  irq_exit+0x7b/0x80
[28969.015380]  do_IRQ+0x45/0xc0
[28969.015382]  common_interrupt+0xf/0xf
[28969.015383]  </IRQ>
[28969.015384] RIP: 0010:mwait_idle+0x5f/0x1b0
[28969.015385] Code: a8 01 0f 85 3f 01 00 00 31 d2 65 48 8b 04 25 80 4c 01 00 48 89 d1 0f 01 c8 48 8b 00 a8 08 0f 85 40 01 00 00 31 c0 fb 0f 01 c9 <65> 8b 2d 2a c9 6a 7e 0f 1f 44 00 00 65 48 8b 04 25 80 4c 01 00 f0
[28969.015386] RSP: 0018:ffffffff82203e98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd3
[28969.015387] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[28969.015388] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[28969.015388] RBP: 0000000000000000 R08: 000000000000033a R09: ffff88046f81fbc0
[28969.015389] R10: 0000000000000000 R11: 00000001006d5cae R12: ffffffff8220f780
[28969.015389] R13: ffffffff8220f780 R14: 0000000000000000 R15: 0000000000000000
[28969.015392]  do_idle+0x1a3/0x1c0
[28969.015394]  cpu_startup_entry+0x14/0x20
[28969.015397]  start_kernel+0x488/0x4a8
[28969.015401]  secondary_startup_64+0xa4/0xb0
[28976.679233] vlan1490: hw csum failure
[28976.679236] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0+ #1
[28976.679237] Call Trace:
[28976.679239]  <IRQ>
[28976.679245]  dump_stack+0x46/0x5b
[28976.679249]  __skb_checksum_complete+0x9a/0xa0
[28976.679251]  tcp_v4_rcv+0xef/0x960
[28976.679254]  ip_local_deliver_finish+0x49/0xd0
[28976.679256]  ip_local_deliver+0x5e/0xe0
[28976.679258]  ? ip_sublist_rcv_finish+0x50/0x50
[28976.679259]  ip_rcv+0x41/0xc0
[28976.679262]  __netif_receive_skb_one_core+0x4b/0x70
[28976.679263]  netif_receive_skb_internal+0x2f/0xd0
[28976.679265]  napi_gro_receive+0xb7/0xe0
[28976.679267]  mlx5e_handle_rx_cqe+0x7a/0xd0
[28976.679269]  mlx5e_poll_rx_cq+0xc6/0x930
[28976.679271]  mlx5e_napi_poll+0xab/0xc90
[28976.679274]  ? kmem_cache_free_bulk+0x1e4/0x280
[28976.679276]  net_rx_action+0x1f1/0x320
[28976.679279]  __do_softirq+0xec/0x2b7
[28976.679282]  irq_exit+0x7b/0x80
[28976.679284]  do_IRQ+0x45/0xc0
[28976.679285]  common_interrupt+0xf/0xf
[28976.679286]  </IRQ>
[28976.679287] RIP: 0010:mwait_idle+0x5f/0x1b0
[28976.679289] Code: a8 01 0f 85 3f 01 00 00 31 d2 65 48 8b 04 25 80 4c 01 00 48 89 d1 0f 01 c8 48 8b 00 a8 08 0f 85 40 01 00 00 31 c0 fb 0f 01 c9 <65> 8b 2d 2a c9 6a 7e 0f 1f 44 00 00 65 48 8b 04 25 80 4c 01 00 f0
[28976.679290] RSP: 0018:ffffffff82203e98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd3
[28976.679292] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[28976.679292] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[28976.679293] RBP: 0000000000000000 R08: 00000000000001dc R09: ffff88046f81fbc0
[28976.679293] R10: 0000000000000000 R11: 00000001006d642a R12: ffffffff8220f780
[28976.679294] R13: ffffffff8220f780 R14: 0000000000000000 R15: 0000000000000000
[28976.679297]  do_idle+0x1a3/0x1c0
[28976.679299]  cpu_startup_entry+0x14/0x20
[28976.679302]  start_kernel+0x488/0x4a8
[28976.679305]  secondary_startup_64+0xa4/0xb0
[28982.432790] vlan2566: hw csum failure
[28982.432794] CPU: 12 PID: 0 Comm: swapper/12 Not tainted 4.19.0+ #1
[28982.432795] Call Trace:
[28982.432796]  <IRQ>
[28982.432803]  dump_stack+0x46/0x5b
[28982.432807]  __skb_checksum_complete+0x9a/0xa0
[28982.432810]  tcp_v4_rcv+0xef/0x960
[28982.432813]  ip_local_deliver_finish+0x49/0xd0
[28982.432814]  ip_local_deliver+0x5e/0xe0
[28982.432816]  ? ip_sublist_rcv_finish+0x50/0x50
[28982.432818]  ip_rcv+0x41/0xc0
[28982.432821]  __netif_receive_skb_one_core+0x4b/0x70
[28982.432822]  netif_receive_skb_internal+0x2f/0xd0
[28982.432824]  napi_gro_receive+0xb7/0xe0
[28982.432827]  mlx5e_handle_rx_cqe+0x7a/0xd0
[28982.432829]  mlx5e_poll_rx_cq+0xc6/0x930
[28982.432830]  mlx5e_napi_poll+0xab/0xc90
[28982.432834]  ? kmem_cache_free_bulk+0x1e4/0x280
[28982.432836]  net_rx_action+0x1f1/0x320
[28982.432839]  __do_softirq+0xec/0x2b7
[28982.432844]  irq_exit+0x7b/0x80
[28982.432845]  do_IRQ+0x45/0xc0
[28982.432847]  common_interrupt+0xf/0xf
[28982.432848]  </IRQ>
[28982.432849] RIP: 0010:mwait_idle+0x5f/0x1b0
[28982.432850] Code: a8 01 0f 85 3f 01 00 00 31 d2 65 48 8b 04 25 80 4c 01 00 48 89 d1 0f 01 c8 48 8b 00 a8 08 0f 85 40 01 00 00 31 c0 fb 0f 01 c9 <65> 8b 2d 2a c9 6a 7e 0f 1f 44 00 00 65 48 8b 04 25 80 4c 01 00 f0
[28982.432851] RSP: 0018:ffffc900033a7eb8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdb
[28982.432852] RAX: 0000000000000000 RBX: 000000000000000c RCX: 0000000000000000
[28982.432852] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[28982.432853] RBP: 000000000000000c R08: 0000000000000197 R09: ffff88046fb1fbc0
[28982.432853] R10: 0000000000000000 R11: 00000001006d69c9 R12: ffff88046d1f62c0
[28982.432854] R13: ffff88046d1f62c0 R14: 0000000000000000 R15: 0000000000000000
[28982.432857]  do_idle+0x1a3/0x1c0
[28982.432859]  cpu_startup_entry+0x14/0x20
[28982.432861]  start_secondary+0x165/0x190
[28982.432864]  secondary_startup_64+0xa4/0xb0

* Re: [Patch V4 net 02/11] net: hns3: add error handler for hns3_get_ring_config/hns3_queue_to_ring
From: Sergei Shtylyov @ 2018-10-30  9:09 UTC (permalink / raw)
  To: Huazhong Tan, davem
  Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
	linyunsheng
In-Reply-To: <1540821261-55002-3-git-send-email-tanhuazhong@huawei.com>

Hello!

On 10/29/2018 4:54 PM, Huazhong Tan wrote:

> When hns3_get_ring_config()/hns3_queue_to_ring() failed during resetting,
> the allocated memory has not been freed before hns3_get_ring_config() and
> hns3_queue_to_ring() return. So this patch fixes the buffer not freeing
> problem during resetting.
> 
> Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
> ---
>   drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 12 ++++++++++--
>   1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
> index d9066c5..6f0fd62 100644
> --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
> +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
[...]
> @@ -3047,7 +3049,7 @@ static int hns3_get_ring_config(struct hns3_nic_priv *priv)
>   {
>   	struct hnae3_handle *h = priv->ae_handle;
>   	struct pci_dev *pdev = h->pdev;
> -	int i, ret;
> +	int i, j, ret;
>   
>   	priv->ring_data =  devm_kzalloc(&pdev->dev,
>   					array3_size(h->kinfo.num_tqps,
> @@ -3065,6 +3067,12 @@ static int hns3_get_ring_config(struct hns3_nic_priv *priv)
>   
>   	return 0;
>   err:
> +	for (j = i - 1; j >= 0; j--) {

    As is with the other patch, you don't need 'j' here.

> +		devm_kfree(priv->dev, priv->ring_data[j].ring);
> +		devm_kfree(priv->dev,
> +			   priv->ring_data[j + h->kinfo.num_tqps].ring);
> +	}
> +
>   	devm_kfree(&pdev->dev, priv->ring_data);
>   	return ret;
>   }
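
To illustrate the point about 'j', here is a minimal userspace sketch (not the driver code; struct item, init_items() and count_initialized() are made-up names for this example). The loop counter already marks the entry that failed, so the error path can count it back down directly:

```c
#include <stdbool.h>

#define NUM_ITEMS 4

/* Hypothetical stand-in for one ring or NAPI context; not a driver type. */
struct item {
	bool initialized;
};

/* Initialize items in order; a negative fail_at means "never fail". */
static int init_items(struct item *items, int count, int fail_at)
{
	int i;

	for (i = 0; i < count; i++) {
		if (i == fail_at)
			goto err;
		items[i].initialized = true;
	}
	return 0;

err:
	/* i indexes the entry that failed; undo entries i-1 down to 0. */
	while (i--)
		items[i].initialized = false;
	return -1;
}

/* Count how many items are still marked initialized. */
static int count_initialized(const struct item *items, int count)
{
	int i, n = 0;

	for (i = 0; i < count; i++)
		n += items[i].initialized;
	return n;
}
```

Because 'while (i--)' tests the value before decrementing it, the loop also terminates when 'i' is an unsigned type, so the unwind by itself would not require changing the index type.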

MBR, Sergei

* [bug report] PCI: Remove NULL device handling from PCI DMA API
From: Dan Carpenter @ 2018-10-30  9:10 UTC (permalink / raw)
  To: hch; +Cc: netdev, Bjorn Helgaas

Hello Christoph Hellwig,

The patch 4167b2ad5182: "PCI: Remove NULL device handling from PCI
DMA API" from Jan 10, 2018, leads to the following static checker
warning:

	drivers/net/ethernet/amd/pcnet32.c:1921 pcnet32_probe1()
	warn: variable dereferenced before check 'pdev' (see line 1843)

drivers/net/ethernet/amd/pcnet32.c
  1839  
  1840          dev->base_addr = ioaddr;
  1841          lp = netdev_priv(dev);
  1842          /* pci_alloc_consistent returns page-aligned memory, so we do not have to check the alignment */
  1843          lp->init_block = pci_alloc_consistent(pdev, sizeof(*lp->init_block),
                                                      ^^^^
This function is called with a NULL "pdev" when we're probing from
pcnet32_probe_vlbus().

  1844                                                &lp->init_dma_addr);
  1845          if (!lp->init_block) {
  1846                  if (pcnet32_debug & NETIF_MSG_PROBE)
  1847                          pr_err("Consistent memory allocation failed\n");
  1848                  ret = -ENOMEM;
  1849                  goto err_free_netdev;
  1850          }
  1851          lp->pci_dev = pdev;
  1852  
  1853          lp->dev = dev;
  1854  

regards,
dan carpenter

* Re: [Patch V4 net 01/11] net: hns3: add error handler for hns3_nic_init_vector_data()
From: Sergei Shtylyov @ 2018-10-30  9:11 UTC (permalink / raw)
  To: Huazhong Tan, davem
  Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
	linyunsheng
In-Reply-To: <1540821261-55002-2-git-send-email-tanhuazhong@huawei.com>

On 10/29/2018 4:54 PM, Huazhong Tan wrote:

> When hns3_nic_init_vector_data() fails to map ring to vector,
> it should cancel the netif_napi_add() that has been successfully
> done and then exits.
> 
> Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
> ---
>   drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 10 ++++++++--
>   1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
> index 32f3aca8..d9066c5 100644
> --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
> +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
> @@ -2821,7 +2821,7 @@ static int hns3_nic_init_vector_data(struct hns3_nic_priv *priv)
>   	struct hnae3_handle *h = priv->ae_handle;
>   	struct hns3_enet_tqp_vector *tqp_vector;
>   	int ret = 0;
> -	u16 i;
> +	int i, j;
>   
>   	hns3_nic_set_cpumask(priv);
>   
> @@ -2868,13 +2868,19 @@ static int hns3_nic_init_vector_data(struct hns3_nic_priv *priv)
>   		hns3_free_vector_ring_chain(tqp_vector, &vector_ring_chain);
>   
>   		if (ret)
> -			return ret;
> +			goto map_ring_fail;
>   
>   		netif_napi_add(priv->netdev, &tqp_vector->napi,
>   			       hns3_nic_common_poll, NAPI_POLL_WEIGHT);
>   	}
>   
>   	return 0;
> +
> +map_ring_fail:
> +	for (j = i - 1; j >= 0; j--)
> +		netif_napi_del(&priv->tqp_vector[j].napi);

    'j' doesn't seem to be needed here either.

[...]

MBR, Sergei

* RE: [PATCH net-next v2 5/6] net/ncsi: Reset channel state in ncsi_start_dev()
From: Justin.Lee1 @ 2018-10-30 18:23 UTC (permalink / raw)
  To: sam, netdev; +Cc: davem, linux-kernel, openbmc
In-Reply-To: <b0fd357b1fc4b9aed6300019557dc8a391ecf52d.camel@mendozajonas.com>



> On Fri, 2018-10-26 at 17:25 +0000, Justin.Lee1@Dell.com wrote:
> > Hi Samuel,
> > 
> > I noticed a few issues and commented below.
> > 
> > Thanks,
> > Justin
> > 
> > 
> > >  /* Resources */
> > > +int ncsi_reset_dev(struct ncsi_dev *nd);
> > >  void ncsi_start_channel_monitor(struct ncsi_channel *nc);
> > >  void ncsi_stop_channel_monitor(struct ncsi_channel *nc);
> > >  struct ncsi_channel *ncsi_find_channel(struct ncsi_package *np,
> > > diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c
> > > index 014321ad31d3..9bad03e3fa5e 100644
> > > --- a/net/ncsi/ncsi-manage.c
> > > +++ b/net/ncsi/ncsi-manage.c
> > > @@ -550,8 +550,10 @@ static void ncsi_suspend_channel(struct ncsi_dev_priv *ndp)
> > >  		spin_lock_irqsave(&nc->lock, flags);
> > >  		nc->state = NCSI_CHANNEL_INACTIVE;
> > >  		spin_unlock_irqrestore(&nc->lock, flags);
> > > -		ncsi_process_next_channel(ndp);
> > > -
> > > +		if (ndp->flags & NCSI_DEV_RESET)
> > > +			ncsi_reset_dev(nd);
> > > +		else
> > > +			ncsi_process_next_channel(ndp);
> > >  		break;
> > >  	default:
> > >  		netdev_warn(nd->dev, "Wrong NCSI state 0x%x in suspend\n",
> > > @@ -1554,7 +1556,7 @@ int ncsi_start_dev(struct ncsi_dev *nd)
> > >  		return 0;
> > >  	}
> > >  
> > > -	return ncsi_choose_active_channel(nd);
> > > +	return ncsi_reset_dev(nd);
> > 
> > If there is no available channel due to the whitelist, the ncsi_start_dev() function will return a
> > failure status and the network interface may fail to come up too. It is possible for a user to disable
> > all channels and leave the interface up for checking the LOM status.
> > 
> 
> I'm not sure that that is a bug, or at least not in the scope of this
> series. If the whitelist is set such that no channels are valid then
> there's nothing for NCSI to do. If we want to do something like always
> monitor all channels then that would be best to do in another patch.
> 
> > >  }
> > >  EXPORT_SYMBOL_GPL(ncsi_start_dev);
> > 
> > Also, if I send set_package_mask and set_channel_mask commands back to back in a program,
> > the state machine doesn't work well. If I use command line and wait for it to complete for 
> > each step, then it is fine.
> 
> Yeah that's not great; probably hitting some corner cases in the NCSI
> locking. I'll look into the multi-channel related stuff but I have a
> feeling that if you tried this with the existing set/clear commands you
> would probably hit something similar, especially on your dual core
> platform. If so this is probably something to fix separately.
> 

It is possible that this is caused by the following code in the ncsi_reset_dev() function.
The state might be overwritten, interrupting the previous operation.

	spin_lock_irqsave(&ndp->lock, flags);
	ndp->flags |= NCSI_DEV_RESET;
	ndp->active_channel = active;
	ndp->active_package = active->package;
	spin_unlock_irqrestore(&ndp->lock, flags);

	nd->state = ncsi_dev_state_suspend;

> > 
> > npcm7xx-emc f0825000.eth eth2: NCSI: Multi-package enabled on ifindex 2, mask 0x00000001
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_stop_channel_monitor() - pkg 0 ch 0
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_dev_work()
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_suspend_channel() - pkg 0 ch 0 state 0400
> > npcm7xx-emc f0825000.eth eth2: NCSI: pkg 0 ch 0 set as preferred channel
> > npcm7xx-emc f0825000.eth eth2: NCSI: Multi-channel enabled on ifindex 2, mask 0x00000003
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_stop_channel_monitor() - pkg 0 ch 1
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_dev_work()
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_suspend_channel() - pkg 0 ch 1 state 0400
> > npcm7xx-emc f0825000.eth eth2: NCSI: Package 1 set to all channels disabled
> > npcm7xx-emc f0825000.eth eth2: NCSI: Multi-channel enabled on ifindex 2, mask 0x00000000
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel()
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - pkg 0
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - pass pkg whitelist
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - ch 0
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - pass ch whitelist
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - skip
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - ch 1
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - pass ch whitelist
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - skip
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - next pkg
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - pkg 1
> > npcm7xx-emc f0825000.eth eth2: NCSI: No channel found to configure!
> > npcm7xx-emc f0825000.eth eth2: NCSI interface down
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_dev_work()
> > npcm7xx-emc f0825000.eth eth2: Wrong NCSI state 0x100 in workqueue
> > 
> > All masks are set correctly, but you can see the PS column is not right and channel doesn't
> > configure correctly.
> > 
> > /sys/kernel/debug/ncsi_protocol# cat ncsi_device_status
> > IFIDX IFNAME NAME   PID CID RX TX MP MC WP WC PC PS LS RU CR NQ HA
> > ===================================================================
> >   2   eth2   ncsi0  000 000 1  1  1  1  1  1  1  0  1  1  1  0  1
> >   2   eth2   ncsi1  000 001 1  0  1  1  1  1  0  0  1  1  1  0  1
> >   2   eth2   ncsi2  001 000 0  0  1  1  0  0  0  0  1  1  1  0  1
> >   2   eth2   ncsi3  001 001 0  0  1  1  0  0  0  0  1  1  1  0  1
> > ===================================================================
> > MP: Multi-mode Package     WP: Whitelist Package
> > MC: Multi-mode Channel     WC: Whitelist Channel
> > PC: Primary Channel
> > PS: Poll Status
> > LS: Link Status
> > RU: Running
> > CR: Carrier OK
> > NQ: Queue Stopped
> > HA: Hardware Arbitration
> > 
> > The PS column is taken from (int)nc->monitor.enabled.



* Re: [PATCH 1/2] net: axienet: recheck condition after timeout in mdio_wait()
From: David Miller @ 2018-10-30 18:25 UTC (permalink / raw)
  To: kurt
  Cc: anirudh, John.Linn, michal.simek, radhey.shyam.pandey, andrew,
	yuehaibing, netdev, linux-arm-kernel, linux-kernel
In-Reply-To: <20181030093139.10226-2-kurt@linutronix.de>

From: Kurt Kanzenbach <kurt@linutronix.de>
Date: Tue, 30 Oct 2018 10:31:38 +0100

> The function could report a false positive if it gets preempted between reading
> the XAE_MDIO_MCR_OFFSET register and checking for the timeout.  In such a case,
> the condition has to be rechecked to avoid false positives.
> 
> Therefore, check for expected condition even after the timeout occurred.
> 
> Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
 ...
>  		if (time_before_eq(end, jiffies)) {
> -			WARN_ON(1);
> -			return -ETIMEDOUT;
> +			val = axienet_ior(lp, XAE_MDIO_MCR_OFFSET);
> +			break;
>  		}
> +
>  		udelay(1);
>  	}
> -	return 0;
> +	if (val & XAE_MDIO_MCR_READY_MASK)
> +		return 0;
> +
> +	WARN_ON(1);
> +	return -ETIMEDOUT;

You are not fundamentally changing the situation at all.

The condition could change right after your last read of
XAE_MDIO_MCR_OFFSET, which is the same thing that happens before your
modifications to this code.

It sounds more like the timeout is slightly too short, and that's the
real problem that causes whatever behavior you think you are fixing
here.

I'm not applying this.
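
For readers following the thread, the polling pattern under discussion can be sketched in userspace C as below. This is only an illustration of the mechanics: struct sim, sim_now(), sim_read_reg(), wait_ready() and READY_MASK are hypothetical names, the clock is simulated, and nothing here is axienet code. It does not settle whether the recheck is the right fix, since the device can still change state after the final read.

```c
#include <stdint.h>

#define READY_MASK 0x80u	/* hypothetical "ready" bit */

/* Simulated device plus clock; every field is made up for this sketch. */
struct sim {
	uint64_t time;		/* current time in arbitrary ticks */
	uint64_t step;		/* ticks that pass on every clock read */
	uint64_t ready_at;	/* register reads report ready from this tick on */
};

/* Reading the clock advances it, which models delay or preemption. */
static uint64_t sim_now(struct sim *s)
{
	s->time += s->step;
	return s->time;
}

static uint32_t sim_read_reg(const struct sim *s)
{
	return s->time >= s->ready_at ? READY_MASK : 0;
}

/* Returns 0 once the ready bit is seen, -1 on timeout. */
static int wait_ready(struct sim *s, uint64_t timeout)
{
	uint64_t end = sim_now(s) + timeout;
	uint32_t val;

	for (;;) {
		val = sim_read_reg(s);
		if (val & READY_MASK)
			return 0;
		if (sim_now(s) >= end) {
			/* Deadline passed: sample once more before giving up. */
			val = sim_read_reg(s);
			break;
		}
	}
	return (val & READY_MASK) ? 0 : -1;
}
```

With a large 'step' the simulated clock jumps past the deadline between the register read and the timeout check, which is the preemption scenario the commit message describes.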

* Re: [PATCH 2/2] net: xilinx_emaclite: recheck condition after timeout in mdio_wait()
From: David Miller @ 2018-10-30 18:25 UTC (permalink / raw)
  To: kurt
  Cc: anirudh, John.Linn, michal.simek, radhey.shyam.pandey, andrew,
	yuehaibing, netdev, linux-arm-kernel, linux-kernel
In-Reply-To: <20181030093139.10226-3-kurt@linutronix.de>

From: Kurt Kanzenbach <kurt@linutronix.de>
Date: Tue, 30 Oct 2018 10:31:39 +0100

> The function could report a false positive if it gets preempted between reading
> the XEL_MDIOCTRL_OFFSET register and checking for the timeout.  In such a case,
> the condition has to be rechecked to avoid false positives.
> 
> Therefore, check for expected condition even after the timeout occurred.
> 
> Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>

Same objections as your previous patch.

This isn't fixing anything.

* Re: [PATCH v2] kselftests/bpf: use ping6 as the default ipv6 ping binary if it exists
From: Song Liu @ 2018-10-30 18:35 UTC (permalink / raw)
  To: lizhijian
  Cc: shuah, Networking, linux-kselftest, open list, Alexei Starovoitov,
	Daniel Borkmann
In-Reply-To: <1540869355-13324-1-git-send-email-lizhijian@cn.fujitsu.com>

On Mon, Oct 29, 2018 at 7:35 PM Li Zhijian <lizhijian@cn.fujitsu.com> wrote:
>
> ping binary on some distros doesn't support "ping -6" anymore.
>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>

I think this should go to bpf-next. Please resubmit when the bpf-next tree is open
(after the merge window). Other than this:

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  tools/testing/selftests/bpf/test_skb_cgroup_id.sh | 3 ++-
>  tools/testing/selftests/bpf/test_sock_addr.sh     | 3 ++-
>  2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/test_skb_cgroup_id.sh b/tools/testing/selftests/bpf/test_skb_cgroup_id.sh
> index 42544a9..a9bc6f8 100755
> --- a/tools/testing/selftests/bpf/test_skb_cgroup_id.sh
> +++ b/tools/testing/selftests/bpf/test_skb_cgroup_id.sh
> @@ -10,7 +10,7 @@ wait_for_ip()
>         echo -n "Wait for testing link-local IP to become available "
>         for _i in $(seq ${MAX_PING_TRIES}); do
>                 echo -n "."
> -               if ping -6 -q -c 1 -W 1 ff02::1%${TEST_IF} >/dev/null 2>&1; then
> +               if $PING6 -c 1 -W 1 ff02::1%${TEST_IF} >/dev/null 2>&1; then
>                         echo " OK"
>                         return
>                 fi
> @@ -58,5 +58,6 @@ BPF_PROG_OBJ="${DIR}/test_skb_cgroup_id_kern.o"
>  BPF_PROG_SECTION="cgroup_id_logger"
>  BPF_PROG_ID=0
>  PROG="${DIR}/test_skb_cgroup_id_user"
> +type ping6 >/dev/null 2>&1 && PING6="ping6" || PING6="ping -6"
>
>  main
> diff --git a/tools/testing/selftests/bpf/test_sock_addr.sh b/tools/testing/selftests/bpf/test_sock_addr.sh
> index 9832a87..3b9fdb8 100755
> --- a/tools/testing/selftests/bpf/test_sock_addr.sh
> +++ b/tools/testing/selftests/bpf/test_sock_addr.sh
> @@ -4,7 +4,8 @@ set -eu
>
>  ping_once()
>  {
> -       ping -${1} -q -c 1 -W 1 ${2%%/*} >/dev/null 2>&1
> +       type ping${1} >/dev/null 2>&1 && PING="ping${1}" || PING="ping -${1}"
> +       $PING -q -c 1 -W 1 ${2%%/*} >/dev/null 2>&1
>  }
>
>  wait_for_ip()
> --
> 2.7.4
>

* Re: [PATCH] net: mvpp2: Fix affinity hint allocation
From: David Miller @ 2018-10-30 18:40 UTC (permalink / raw)
  To: marc.zyngier
  Cc: antoine.tenart, netdev, linux-kernel, thomas.petazzoni,
	maxime.chevallier, miquel.raynal, gregory.clement, nadavh,
	stefanc, ymarkman, mw
In-Reply-To: <20181030154100.78512-1-marc.zyngier@arm.com>

From: Marc Zyngier <marc.zyngier@arm.com>
Date: Tue, 30 Oct 2018 15:41:00 +0000

> The mvpp2 driver has the curious behaviour of passing a stack variable
> to irq_set_affinity_hint(), which results in the kernel exploding
> the first time anyone accesses this information. News flash: userspace
> does, and irqbalance will happily take the machine down. Great stuff.
> 
> An easy fix is to track the mask within the queue_vector structure,
> and to make sure it has the same lifetime as the interrupt itself.
> 
> Fixes: e531f76757eb ("net: mvpp2: handle cases where more CPUs are available than s/w threads")
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
> As requested in https://lore.kernel.org/lkml/20181030135354.GD3407@kwain/

Applied.
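
The lifetime problem described in the commit message can be sketched in userspace C, assuming (as the description implies) that the hint registration keeps only the pointer it is given. Everything below is hypothetical and simplified: hint_table, set_hint(), struct cpumask_like and struct queue_vector_like are made-up names, not the mvpp2 or IRQ core code.

```c
#include <stddef.h>

#define NUM_IRQS 8

/* Hypothetical, simplified CPU mask. */
struct cpumask_like {
	unsigned long bits;
};

/* Registry that stores only the pointer, never a copy of the mask. */
static const struct cpumask_like *hint_table[NUM_IRQS];

static void set_hint(unsigned int irq, const struct cpumask_like *mask)
{
	hint_table[irq] = mask;
}

/* The mask is embedded in the vector, so it lives as long as the vector. */
struct queue_vector_like {
	unsigned int irq;
	struct cpumask_like mask;
};

static void vector_setup(struct queue_vector_like *qv, unsigned int irq,
			 unsigned int cpu)
{
	qv->irq = irq;
	qv->mask.bits = 1UL << cpu;
	set_hint(irq, &qv->mask);	/* stays valid while qv exists */
}

static void vector_teardown(struct queue_vector_like *qv)
{
	set_hint(qv->irq, NULL);	/* drop the hint before qv goes away */
}

/* A later reader, standing in for userspace querying the hint. */
static unsigned long read_hint(unsigned int irq)
{
	return hint_table[irq] ? hint_table[irq]->bits : 0;
}
```

Had vector_setup() passed the address of a local mask instead, read_hint() would follow a dangling pointer once the function returned, which is the failure mode the patch description attributes to the stack variable.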

* Re: [Patch V4 net 02/11] net: hns3: add error handler for hns3_get_ring_config/hns3_queue_to_ring
From: tanhuazhong @ 2018-10-30 10:17 UTC (permalink / raw)
  To: Sergei Shtylyov, davem
  Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
	linyunsheng
In-Reply-To: <ddef9d62-ec9c-6df6-113a-e7b0a0bd2a05@cogentembedded.com>



On 2018/10/30 17:09, Sergei Shtylyov wrote:
> Hello!
> 
> On 10/29/2018 4:54 PM, Huazhong Tan wrote:
> 
>> When hns3_get_ring_config()/hns3_queue_to_ring() failed during resetting,
>> the allocated memory has not been freed before hns3_get_ring_config() and
>> hns3_queue_to_ring() return. So this patch fixes the buffer not freeing
>> problem during resetting.
>>
>> Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
>> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
>> ---
>>   drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 12 ++++++++++--
>>   1 file changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
>> index d9066c5..6f0fd62 100644
>> --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
>> +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
> [...]
>> @@ -3047,7 +3049,7 @@ static int hns3_get_ring_config(struct hns3_nic_priv *priv)
>>   {
>>       struct hnae3_handle *h = priv->ae_handle;
>>       struct pci_dev *pdev = h->pdev;
>> -    int i, ret;
>> +    int i, j, ret;
>>       priv->ring_data =  devm_kzalloc(&pdev->dev,
>>                       array3_size(h->kinfo.num_tqps,
>> @@ -3065,6 +3067,12 @@ static int hns3_get_ring_config(struct hns3_nic_priv *priv)
>>       return 0;
>>   err:
>> +    for (j = i - 1; j >= 0; j--) {
> 
>     As is with the other patch, you don't need 'j' here.
> 

Yes, I have modified it.
Thanks.

>> +        devm_kfree(priv->dev, priv->ring_data[j].ring);
>> +        devm_kfree(priv->dev,
>> +               priv->ring_data[j + h->kinfo.num_tqps].ring);
>> +    }
>> +
>>       devm_kfree(&pdev->dev, priv->ring_data);
>>       return ret;
>>   }
> 
> MBR, Sergei
> 

Greeting.
Huazhong.

> .
> 

^ permalink raw reply

* Re: [Patch V4 net 01/11] net: hns3: add error handler for hns3_nic_init_vector_data()
From: tanhuazhong @ 2018-10-30 10:19 UTC (permalink / raw)
  To: Sergei Shtylyov, davem
  Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
	linyunsheng
In-Reply-To: <bca282be-8688-0e16-e785-986dac0e7ed8@cogentembedded.com>



On 2018/10/30 17:11, Sergei Shtylyov wrote:
> On 10/29/2018 4:54 PM, Huazhong Tan wrote:
> 
>> When hns3_nic_init_vector_data() fails to map ring to vector,
>> it should cancel the netif_napi_add() that has been successfully
>> done and then exits.
>>
>> Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver 
>> for hip08 SoC")
>> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
>> ---
>>   drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 10 ++++++++--
>>   1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c 
>> b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
>> index 32f3aca8..d9066c5 100644
>> --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
>> +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
>> @@ -2821,7 +2821,7 @@ static int hns3_nic_init_vector_data(struct 
>> hns3_nic_priv *priv)
>>       struct hnae3_handle *h = priv->ae_handle;
>>       struct hns3_enet_tqp_vector *tqp_vector;
>>       int ret = 0;
>> -    u16 i;
>> +    int i, j;
>>       hns3_nic_set_cpumask(priv);
>> @@ -2868,13 +2868,19 @@ static int hns3_nic_init_vector_data(struct 
>> hns3_nic_priv *priv)
>>           hns3_free_vector_ring_chain(tqp_vector, &vector_ring_chain);
>>           if (ret)
>> -            return ret;
>> +            goto map_ring_fail;
>>           netif_napi_add(priv->netdev, &tqp_vector->napi,
>>                      hns3_nic_common_poll, NAPI_POLL_WEIGHT);
>>       }
>>       return 0;
>> +
>> +map_ring_fail:
>> +    for (j = i - 1; j >= 0; j--)
>> +        netif_napi_del(&priv->tqp_vector[j].napi);
> 
>     'j' doesn't seem needed as well.
> 

Yes, it will be changed to the one below.

+
+map_ring_fail:
+	while(i--)
+		netif_napi_del(&priv->tqp_vector[i].napi);
+
+	return ret;

> [...]
> 
> MBR, Sergei
> 
> .

Thanks, Huazhong.
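
The "while (i--)" unwind proposed above can be sketched outside the driver
as follows. This is only an illustration under stated assumptions:
NUM_VECTORS, registered[], map_vector(), init_vectors() and
count_registered() are hypothetical names that stand in for the driver's
vectors, netif_napi_add() and netif_napi_del(); none of them come from
hns3.

```c
#define NUM_VECTORS 4   /* hypothetical vector count */

static int registered[NUM_VECTORS];   /* 1 while a slot is "added" */

/* Hypothetical stand-in for the per-vector mapping step; fails at fail_at. */
static int map_vector(int idx, int fail_at)
{
	return idx == fail_at ? -1 : 0;
}

/* Returns 0 on success; on failure undoes every slot registered so far. */
static int init_vectors(int fail_at)
{
	int i, ret = 0;

	for (i = 0; i < NUM_VECTORS; i++) {
		ret = map_vector(i, fail_at);
		if (ret)
			goto map_fail;
		registered[i] = 1;	/* plays the role of netif_napi_add() */
	}
	return 0;

map_fail:
	/* i is the index that failed, so slots 0..i-1 were registered. */
	while (i--)
		registered[i] = 0;	/* plays the role of netif_napi_del() */
	return ret;
}

/* Helper for checking the result: number of slots still registered. */
static int count_registered(void)
{
	int i, n = 0;

	for (i = 0; i < NUM_VECTORS; i++)
		n += registered[i];
	return n;
}
```

Because the post-decrement tests i before lowering it, the loop body runs
for i-1 down to 0 and no second counter is needed.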

> 

^ permalink raw reply

* Re: [PATCH v4.14-stable] sch_netem: restore skb->dev after dequeuing from the rbtree
From: Eduardo Valentin @ 2018-10-30 19:12 UTC (permalink / raw)
  To: David Miller, gregkh
  Cc: cpaasch, netdev, stable, stephen, luqia, edumazet, soheil, weiwan,
	willemb
In-Reply-To: <20181018.154348.2028947036934395230.davem@davemloft.net>

Greg,

On Thu, Oct 18, 2018 at 03:43:48PM -0700, David Miller wrote:
> From: Christoph Paasch <cpaasch@apple.com>
> Date: Thu, 18 Oct 2018 13:38:40 -0700
> 
> > Upstream commit bffa72cf7f9d ("net: sk_buff rbnode reorg") got
> > backported as commit 6b921536f170 ("net: sk_buff rbnode reorg") into the
> > v4.14.x-tree.
> > 
> > However, the backport does not include the changes in sch_netem.c
> > 
> > We need these, as otherwise the skb->dev pointer is not set when
> > dequeueing from the netem rbtree, resulting in a panic:
>  ...
> > Fixes: 6b921536f170 ("net: sk_buff rbnode reorg")
> > Cc: Stephen Hemminger <stephen@networkplumber.org>
> > Cc: Eric Dumazet <edumazet@google.com>
> > Cc: Soheil Hassas Yeganeh <soheil@google.com>
> > Cc: Wei Wang <weiwan@google.com>
> > Cc: Willem de Bruijn <willemb@google.com>
> > Signed-off-by: Christoph Paasch <cpaasch@apple.com>
> > ---
> > 
> > Notes:
> >     This patch should only make it into v4.14-stable as that's the only branch where
> >     the offending commit has been backported to.
> 
> Greg, please queue up.

Are you planning to queue this one?

It looks to me like it was a miss in the backport.

It seems that the backport touched different files and missed the change
to net/sched/sch_netem.c. So, to me, even if this patch does not strictly
follow the stable rules, as it is not an upstream patch, it seems to be a
needed change, even if it is specific to stable linux-4.14.y.
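
To illustrate why the missing hunk matters, here is a minimal userspace
sketch. It assumes, as I understand the rbnode reorg, that the rbtree node
shares storage with the next/prev/dev fields of the buffer. All names below
(struct pkt, struct rb_node_sketch, tree_insert(), tree_dequeue()) are
hypothetical stand-ins and not the kernel's definitions.

```c
#include <stddef.h>

struct net_device { int ifindex; };			/* hypothetical stand-in */
struct rb_node_sketch { void *parent, *left, *right; };	/* hypothetical */

struct pkt {
	union {
		struct {
			struct pkt *next;
			struct pkt *prev;
			struct net_device *dev;
		};
		struct rb_node_sketch rbnode;	/* overlaps next/prev/dev */
	};
};

/* Linking into the tree writes rbnode, which overwrites the dev storage. */
static void tree_insert(struct pkt *p)
{
	p->rbnode.parent = NULL;
	p->rbnode.left = NULL;
	p->rbnode.right = NULL;
}

/* After taking the packet out of the tree, dev has to be set again. */
static void tree_dequeue(struct pkt *p, struct net_device *owner)
{
	p->next = NULL;
	p->prev = NULL;
	p->dev = owner;		/* the restore step this patch is about */
}
```

Without the assignment in tree_dequeue(), the caller would be handed a
packet whose dev field holds whatever the tree code last wrote there.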

> 

-- 
All the best,
Eduardo Valentin

^ permalink raw reply

* WARNING in rds_message_alloc_sgs
From: syzbot @ 2018-10-30 19:28 UTC (permalink / raw)
  To: davem, linux-kernel, linux-rdma, netdev, rds-devel,
	santosh.shilimkar, syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    6201f31a39f8 Add linux-next specific files for 20181030
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1397d06d400000
kernel config:  https://syzkaller.appspot.com/x/.config?x=2a22859d870756c1
dashboard link: https://syzkaller.appspot.com/bug?extid=26de17458aeda9d305d8
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=10bb52eb400000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=118bdfc5400000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+26de17458aeda9d305d8@syzkaller.appspotmail.com

WARNING: CPU: 0 PID: 19789 at net/rds/message.c:316  
rds_message_alloc_sgs+0x10c/0x160 net/rds/message.c:316
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 19789 Comm: syz-executor827 Not tainted 4.19.0-next-20181030+  
#101
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x244/0x39d lib/dump_stack.c:113
  panic+0x2ad/0x55c kernel/panic.c:188
  __warn.cold.8+0x20/0x45 kernel/panic.c:540
  report_bug+0x254/0x2d0 lib/bug.c:186
  fixup_bug arch/x86/kernel/traps.c:178 [inline]
  do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
  do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290
  invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:969
RIP: 0010:rds_message_alloc_sgs+0x10c/0x160 net/rds/message.c:316
Code: c0 74 04 3c 03 7e 6c 44 01 ab 78 01 00 00 e8 2b 9e 35 fa 4c 89 e0 48  
83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 14 9e 35 fa <0f> 0b 31 ff 44  
89 ee e8 18 9f 35 fa 45 85 ed 75 1b e8 fe 9d 35 fa
RSP: 0018:ffff8801c51b7460 EFLAGS: 00010293
RAX: ffff8801bc412080 RBX: ffff8801d7bf4040 RCX: ffffffff8749c9e6
RDX: 0000000000000000 RSI: ffffffff8749ca5c RDI: 0000000000000004
RBP: ffff8801c51b7490 R08: ffff8801bc412080 R09: ffffed003b5c5b67
R10: ffffed003b5c5b67 R11: ffff8801dae2db3b R12: 0000000000000000
R13: 000000000007165c R14: 000000000007165c R15: 0000000000000005
  rds_cmsg_rdma_args+0x82d/0x1510 net/rds/rdma.c:623
  rds_cmsg_send net/rds/send.c:971 [inline]
  rds_sendmsg+0x19a2/0x3180 net/rds/send.c:1273
  sock_sendmsg_nosec net/socket.c:622 [inline]
  sock_sendmsg+0xd5/0x120 net/socket.c:632
  ___sys_sendmsg+0x7fd/0x930 net/socket.c:2117
  __sys_sendmsg+0x11d/0x280 net/socket.c:2155
  __do_sys_sendmsg net/socket.c:2164 [inline]
  __se_sys_sendmsg net/socket.c:2162 [inline]
  __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2162
  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x44a859
Code: e8 dc e6 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7  
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff  
ff 0f 83 6b cb fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f1d4710ada8 EFLAGS: 00000297 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000006dcc28 RCX: 000000000044a859
RDX: 0000000000000000 RSI: 0000000020001600 RDI: 0000000000000003
RBP: 00000000006dcc20 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000297 R12: 00000000006dcc2c
R13: 646e732f7665642f R14: 00007f1d4710b9c0 R15: 00000000006dcd2c
Kernel Offset: disabled
Rebooting in 86400 seconds..


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with  
syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply

* Re: WARNING in rds_message_alloc_sgs
From: Santosh Shilimkar @ 2018-10-30 19:38 UTC (permalink / raw)
  To: syzbot, linux-rdma, netdev, rds-devel, syzkaller-bugs; +Cc: davem, linux-kernel
In-Reply-To: <0000000000003c6b7b0579772ff3@google.com>

On 10/30/2018 12:28 PM, syzbot wrote:
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:    6201f31a39f8 Add linux-next specific files for 20181030
> git tree:       linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=1397d06d400000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=2a22859d870756c1
> dashboard link: 
> https://syzkaller.appspot.com/bug?extid=26de17458aeda9d305d8
> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=10bb52eb400000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=118bdfc5400000
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+26de17458aeda9d305d8@syzkaller.appspotmail.com
> 
> WARNING: CPU: 0 PID: 19789 at net/rds/message.c:316 
> rds_message_alloc_sgs+0x10c/0x160 net/rds/message.c:316
> Kernel panic - not syncing: panic_on_warn set ...
It looks like this kernel build has panic_on_warn enabled, which
turns the WARN_ON(!nr_pages) case into a panic. I will look into
it. Thanks!

Regards,
Santosh

^ permalink raw reply

* Re: [BUG] MVPP2 driver exploding in presence of a tap interface
From: Antoine Tenart @ 2018-10-30 10:50 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Thomas Petazzoni, Maxime Chevallier, Antoine Tenart,
	Marcin Wojtas, linux-arm-kernel@lists.infradead.org,
	netdev@vger.kernel.org
In-Reply-To: <6355174d-4ab6-595d-17db-311bce607aef@arm.com>

Marc,

On Mon, Oct 29, 2018 at 03:05:53PM +0000, Marc Zyngier wrote:
> 
> This is a follow-up on the conversation Thomas and I had last week at 
> ELC, with me ranting at the sorry state of the MVPP2 driver.

> Triggering this is dead simple:
> - Add a macvtap to one of the MVPP2 interfaces
> - Bring it online
> - Watch the kernel exploding and memory being corrupted
> 
> You don't even need anything listening on the tap interface, just its
> simple existence triggers it. I use a similar setup on a large variety 
> of machines, and this box is the only one that catches fire. Removing
> the macvtap interface makes it (more) reliable.
> 
> Given that I cannot reproduce this issue on any other ARM (32 or 64bit)
> platform, including other Marvell stuff, I can only conclude that the
> MVPP2 driver is responsible for this.
> 
> Example crash and .config below (4.19 vanilla, as linux/master dies in
> new and wonderful ways on this box). I'm looking forward to testing any
> idea you may have.

I used a 4.19 vanilla kernel, with both your configuration and mine,
on 2 different Macchiatobins, but was unable to trigger the issue:

  # ip link set eth0 up
  # ip link add link eth0 name macvtap0 type macvtap
  # ip link set macvtap0 up

I can even configure the eth0/macvtap0 interfaces, and use them
generating or receiving tcp/udp/icmp traffic.

(I also made other tests using macvtap and tap interfaces).

How much memory do you have on the board? What version of ATF are you
using? Version of U-Boot?

Antoine

-- 
Antoine Ténart, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply

* Re: Fw: [Bug 201423] New: eth0: hw csum failure
From: Andre Tomt @ 2018-10-30 10:58 UTC (permalink / raw)
  To: Eric Dumazet, Eric Dumazet
  Cc: Stephen Hemminger, netdev, rossi.f, Dimitris Michailidis
In-Reply-To: <91545596-f932-8834-f613-feda3edc9b84@tomt.net>

On 27.10.2018 23:41, Andre Tomt wrote:
> On 26.10.2018 13:45, Andre Tomt wrote:
>> On 25.10.2018 19:38, Eric Dumazet wrote:
>>>
>>>
>>> On 10/24/2018 12:41 PM, Andre Tomt wrote:
>>>>
>>>> It eventually showed up again with mlx4, on 4.18.16 + fix and also 
>>>> on 4.19. I still do not have a useful packet capture.
>>>>
>>>> It is running a torrent client serving up various linux distributions.
>>>>
>>>
>>> Have you also applied this fix ?
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=db4f1be3ca9b0ef7330763d07bf4ace83ad6f913 
>>>
>>>
>>
>> No. I've applied it now to 4.19 and will report back if anything shows 
>> up.
> 
> Just hit it on the simpler server; no VRF, no tunnels, no nat/conntrack. 
> Only a basic stateless nftables ruleset and a vlan netdev (unlikely to 
> be the one triggering this I guess; it has only v4 traffic).

I'm currently testing 4.19 with the recommended commit added, plus these 
to sort out some GRO issues (on a hunch, unsure if related):
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=a8305bff685252e80b7c60f4f5e7dd2e63e38218
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=992cba7e276d438ac8b0a8c17b147b37c8c286f7
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=ece23711dd956cd5053c9cb03e9fe0668f9c8894

and I *think* it is behaving better now? it's not conclusive as it could 
take a while to trip in this environment but some of the test servers 
have not shown anything bad in almost 24h.

^ permalink raw reply

* Re: Fw: [Bug 201423] New: eth0: hw csum failure
From: Andre Tomt @ 2018-10-30 11:04 UTC (permalink / raw)
  To: Eric Dumazet, Eric Dumazet
  Cc: Stephen Hemminger, netdev, rossi.f, Dimitris Michailidis
In-Reply-To: <869fbb53-a0a5-95f9-2c77-c3ae3f6d181f@tomt.net>

On 30.10.2018 11:58, Andre Tomt wrote:
> On 27.10.2018 23:41, Andre Tomt wrote:
>> On 26.10.2018 13:45, Andre Tomt wrote:
>>> On 25.10.2018 19:38, Eric Dumazet wrote:
>>>>
>>>>
>>>> On 10/24/2018 12:41 PM, Andre Tomt wrote:
>>>>>
>>>>> It eventually showed up again with mlx4, on 4.18.16 + fix and also 
>>>>> on 4.19. I still do not have a useful packet capture.
>>>>>
>>>>> It is running a torrent client serving up various linux distributions.
>>>>>
>>>>
>>>> Have you also applied this fix ?
>>>>
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=db4f1be3ca9b0ef7330763d07bf4ace83ad6f913 
>>>>
>>>>
>>>
>>> No. I've applied it now to 4.19 and will report back if anything 
>>> shows up.
>>
>> Just hit it on the simpler server; no VRF, no tunnels, no 
>> nat/conntrack. Only a basic stateless nftables ruleset and a vlan 
>> netdev (unlikely to be the one triggering this I guess; it has only v4 
>> traffic).
> 
> I'm currently testing 4.19 with the recommended commit added, plus these 
> to sort out some GRO issues (on a hunch, unsure if related):
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=a8305bff685252e80b7c60f4f5e7dd2e63e38218 
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=992cba7e276d438ac8b0a8c17b147b37c8c286f7 
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=ece23711dd956cd5053c9cb03e9fe0668f9c8894 
> 
> 
> and I *think* it is behaving better now? it's not conclusive as it could 
> take a while to trip in this environment but some of the test servers 
> have not shown anything bad in almost 24h.

Sorry, s/some of the/none of the

^ permalink raw reply

* Re: [PATCH net v5] net/ipv6: Add anycast addresses to a global hashtable
From: Jeff Barnhill @ 2018-10-30 11:10 UTC (permalink / raw)
  To: davem; +Cc: netdev, Alexey Kuznetsov, yoshfuji
In-Reply-To: <20181029.203211.604037421868394185.davem@davemloft.net>

I originally started implementing it the way you suggested; however,
it seemed to complicate management of that structure because it isn't
currently using rcu.  Also, assuming that can be worked out, where
would I get the net from?  Would I need to store a copy in ifcaddr6,
or is there some way to access it during ipv6_chk_acast_addr()?  It
seems that if I don't add a copy of net, but instead access it through
aca_rt(?), then freeing the ifcaddr6 memory becomes problematic
(detaching it from idev, while read_rcu may still be accessing it).

On Mon, Oct 29, 2018 at 11:32 PM David Miller <davem@davemloft.net> wrote:
>
> From: Jeff Barnhill <0xeffeff@gmail.com>
> Date: Sun, 28 Oct 2018 01:51:59 +0000
>
> > +struct ipv6_ac_addrlist {
> > +     struct in6_addr         acal_addr;
> > +     possible_net_t          acal_pnet;
> > +     refcount_t              acal_users;
> > +     struct hlist_node       acal_lst; /* inet6_acaddr_lst */
> > +     struct rcu_head         rcu;
> > +};
>
> Please just add the hlist to ifcaddr6 instead of duplicating so much
> information and reference counters here.
>
> This seems to waste a lot of memory unnecessary and add lots of
> unnecessary object allocate/setup/destroy logic.

^ permalink raw reply

* Re: [PATCH] xfrm: Fix error return code in xfrm_output_one()
From: Steffen Klassert @ 2018-10-30 11:48 UTC (permalink / raw)
  To: Wei Yongjun; +Cc: Herbert Xu, netdev, kernel-janitors
In-Reply-To: <1540620726-100678-1-git-send-email-weiyongjun1@huawei.com>

On Sat, Oct 27, 2018 at 06:12:06AM +0000, Wei Yongjun wrote:
> xfrm_output_one() does not return a error code when there is
> no dst_entry attached to the skb, it is still possible crash
> with a NULL pointer dereference in xfrm_output_resume(). Fix
> it by return error code -EHOSTUNREACH.
> 
> Fixes: 9e1437937807 ("xfrm: Fix NULL pointer dereference when skb_dst_force clears the dst_entry.")
> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>

Applied, thanks a lot!
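
For readers following along, the error-path pattern described in the
commit message can be sketched as below. This is a simplified illustration
and not the xfrm code: struct dst_sketch, struct pkt_sketch, output_one()
and output_resume() are hypothetical names.

```c
#include <errno.h>
#include <stddef.h>

struct dst_sketch { int unused; };		/* hypothetical stand-in */
struct pkt_sketch { struct dst_sketch *dst; };	/* hypothetical stand-in */

/* Returns 0 on success or a negative errno. */
static int output_one(struct pkt_sketch *p)
{
	int err = 0;

	if (!p->dst) {
		/*
		 * Without this assignment err would still be 0 and the
		 * caller would carry on with a packet that has no route.
		 */
		err = -EHOSTUNREACH;
		goto error;
	}
	/* ... the transform steps would go here ... */
	return 0;

error:
	return err;
}

/* The caller only dereferences p->dst once output_one() has succeeded. */
static int output_resume(struct pkt_sketch *p)
{
	int err = output_one(p);

	if (err)
		return err;
	return p->dst->unused;
}
```

The point is that every "goto error" must be preceded by an assignment to
err; a path that jumps with err still 0 reports success to the caller.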

^ permalink raw reply
