Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH RFC] net: dsa: Make switches VLAN aware when enslaved into a bridge
From: Ido Schimmel @ 2018-10-26 15:10 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: netdev@vger.kernel.org, Jiri Pirko, Petr Machata,
	privat@egil-hjelmeland.no, Woojung.Huh@microchip.com,
	tristram.ha@microchip.com, Andrew Lunn, Vivien Didelot,
	David S. Miller, open list
In-Reply-To: <20181024193657.24012-1-f.fainelli@gmail.com>

On Wed, Oct 24, 2018 at 12:36:57PM -0700, Florian Fainelli wrote:
> Commit 2ea7a679ca2a ("net: dsa: Don't add vlans when vlan filtering is
> disabled") changed the behavior of DSA switches when the switch ports
> are enslaved into the bridge and only pushed the VLAN configuration down
> to the switch if the bridge is configured with VLAN filtering enabled.

This is what mlxsw is doing.

> This is unfortunately wrong, because what vlan_filtering configures is a
> policy on the acceptance of VLAN tagged frames with an unknown VID.
> 
> vlan_filtering=0 means a frame with a VLAN tag that is not part of the
> VLAN table should be allowed to ingress the switch, and vlan_fltering=1
> would reject that frame.

While you correctly describe the logic, this is not how VLAN-unaware
bridges are actually used. The expectation is that packets will be
untagged when entering the bridge. Either because they are truly
untagged or because they were untagged by a VLAN netdev.

For a long time we rejected the enslavement of physical ports to
VLAN-unaware bridges and only allowed VLAN netdevs to be enslaved. In
order to support the logic you described, we would need to map all 4K
VLANs on each port to 4K different FIDs. In addition, each FDB entry
would need to be programmed 4K times, each time with a different FID.
This is because FDB lookup is performed using {MAC, FID} and not only
MAC. I can go into more details about why we cannot map different VLANs
on a port to the same FID, but I do not think it is pertinent to our
discussion.

Eventually, users started complaining about this constraint and we
relaxed it in commit 65b53bfd497b ("mlxsw: spectrum_switchdev: Allow
port enslavement to a VLAN-unaware bridge").

P.S. Corrected Petr's mail address.

^ permalink raw reply

* Re: [PATCH net] bridge: do not add port to router list when receives query with source 0.0.0.0
From: Nikolay Aleksandrov @ 2018-10-26  7:27 UTC (permalink / raw)
  To: Hangbin Liu, netdev
  Cc: Jiri Pirko, Linus Lüssing, David S. Miller, bridge,
	Roopa Prabhu
In-Reply-To: <1540520923-17589-1-git-send-email-liuhangbin@gmail.com>

On 26/10/2018 05:28, Hangbin Liu wrote:
> Based on RFC 4541, 2.1.1.  IGMP Forwarding Rules
> 
>   The switch supporting IGMP snooping must maintain a list of
>   multicast routers and the ports on which they are attached.  This
>   list can be constructed in any combination of the following ways:
> 
>   a) This list should be built by the snooping switch sending
>      Multicast Router Solicitation messages as described in IGMP
>      Multicast Router Discovery [MRDISC].  It may also snoop
>      Multicast Router Advertisement messages sent by and to other
>      nodes.
> 
>   b) The arrival port for IGMP Queries (sent by multicast routers)
>      where the source address is not 0.0.0.0.
> 
> We should not add the port to router list when receives query with source
> 0.0.0.0.
> 
> Reported-by: Ying Xu <yinxu@redhat.com>
> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
> ---
>  net/bridge/br_multicast.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
> index 024139b..41cdafb 100644
> --- a/net/bridge/br_multicast.c
> +++ b/net/bridge/br_multicast.c
> @@ -1422,7 +1422,15 @@ static void br_multicast_query_received(struct net_bridge *br,
>  		return;
>  
>  	br_multicast_update_query_timer(br, query, max_delay);
> -	br_multicast_mark_router(br, port);
> +
> +	/* Based on RFC4541, section 2.1.1 IGMP Forwarding Rules,
> +	 * the arrival port for IGMP Queries where the source address
> +	 * is 0.0.0.0 should not be added to router port list.
> +	 */
> +	if ((saddr->proto == htons(ETH_P_IP) && saddr->u.ip4) ||
> +	    (saddr->proto == htons(ETH_P_IPV6) &&
> +	     !ipv6_addr_any(&saddr->u.ip6)))
> +		br_multicast_mark_router(br, port);
>  }
>  
>  static void br_ip4_multicast_query(struct net_bridge *br,
> 

+CC Roopa & bridge@lists.linux-foundation.org

^ permalink raw reply

* KASAN: slab-out-of-bounds Read in sctp_getsockopt
From: syzbot @ 2018-10-26 16:38 UTC (permalink / raw)
  To: davem, linux-kernel, linux-sctp, marcelo.leitner, netdev, nhorman,
	syzkaller-bugs, vyasevich

Hello,

syzbot found the following crash on:

HEAD commit:    bd6bf7c10484 Merge tag 'pci-v4.20-changes' of git://git.ke..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=16fd6bcb400000
kernel config:  https://syzkaller.appspot.com/x/.config?x=2dd8629d56664133
dashboard link: https://syzkaller.appspot.com/bug?extid=5da0d0a72a9e7d791748
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16b3ea33400000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17f9f1bd400000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+5da0d0a72a9e7d791748@syzkaller.appspotmail.com

==================================================================
BUG: KASAN: slab-out-of-bounds in sctp_getsockopt_pr_streamstatus  
net/sctp/socket.c:7174 [inline]
BUG: KASAN: slab-out-of-bounds in sctp_getsockopt+0x7516/0x7cc2  
net/sctp/socket.c:7582
Read of size 8 at addr ffff8801d89f0968 by task syz-executor278/5330

CPU: 1 PID: 5330 Comm: syz-executor278 Not tainted 4.19.0+ #303
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x244/0x39d lib/dump_stack.c:113
  print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
  kasan_report_error mm/kasan/report.c:354 [inline]
  kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
  sctp_getsockopt_pr_streamstatus net/sctp/socket.c:7174 [inline]
  sctp_getsockopt+0x7516/0x7cc2 net/sctp/socket.c:7582
  sock_common_getsockopt+0x9a/0xe0 net/core/sock.c:2937
  __sys_getsockopt+0x1ad/0x390 net/socket.c:1939
  __do_sys_getsockopt net/socket.c:1950 [inline]
  __se_sys_getsockopt net/socket.c:1947 [inline]
  __x64_sys_getsockopt+0xbe/0x150 net/socket.c:1947
  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x445789
Code: e8 6c b6 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7  
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff  
ff 0f 83 2b 12 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007effdb293db8 EFLAGS: 00000246 ORIG_RAX: 0000000000000037
RAX: ffffffffffffffda RBX: 00000000006dac48 RCX: 0000000000445789
RDX: 0000000000000074 RSI: 0000000000000084 RDI: 0000000000000003
RBP: 00000000006dac40 R08: 0000000020000040 R09: 0000000000000000
R10: 0000000020000080 R11: 0000000000000246 R12: 00000000006dac4c
R13: 00007ffcfc408c6f R14: 00007effdb2949c0 R15: 00000000006dad2c

Allocated by task 5329:
  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
  set_track mm/kasan/kasan.c:460 [inline]
  kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
  kmem_cache_alloc_trace+0x152/0x750 mm/slab.c:3620
  kmalloc include/linux/slab.h:513 [inline]
  kzalloc include/linux/slab.h:707 [inline]
  sctp_stream_init_ext+0x4f/0xf0 net/sctp/stream.c:237
  sctp_sendmsg_to_asoc+0x1308/0x1a20 net/sctp/socket.c:1896
  sctp_sendmsg+0x13c2/0x1da0 net/sctp/socket.c:2113
  inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798
  sock_sendmsg_nosec net/socket.c:621 [inline]
  sock_sendmsg+0xd5/0x120 net/socket.c:631
  __sys_sendto+0x3d7/0x670 net/socket.c:1788
  __do_sys_sendto net/socket.c:1800 [inline]
  __se_sys_sendto net/socket.c:1796 [inline]
  __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1796
  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 3223:
  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
  set_track mm/kasan/kasan.c:460 [inline]
  __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
  kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
  __cache_free mm/slab.c:3498 [inline]
  kfree+0xcf/0x230 mm/slab.c:3813
  kzfree+0x28/0x30 mm/slab_common.c:1543
  aa_free_file_ctx security/apparmor/include/file.h:76 [inline]
  apparmor_file_free_security+0x133/0x1a0 security/apparmor/lsm.c:448
  security_file_free+0x4a/0x80 security/security.c:900
  file_free fs/file_table.c:54 [inline]
  __fput+0x4e8/0xa30 fs/file_table.c:294
  ____fput+0x15/0x20 fs/file_table.c:309
  task_work_run+0x1e8/0x2a0 kernel/task_work.c:113
  tracehook_notify_resume include/linux/tracehook.h:188 [inline]
  exit_to_usermode_loop+0x318/0x380 arch/x86/entry/common.c:166
  prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
  syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
  do_syscall_64+0x6be/0x820 arch/x86/entry/common.c:293
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

The buggy address belongs to the object at ffff8801d89f0900
  which belongs to the cache kmalloc-96 of size 96
The buggy address is located 8 bytes to the right of
  96-byte region [ffff8801d89f0900, ffff8801d89f0960)
The buggy address belongs to the page:
page:ffffea0007627c00 count:1 mapcount:0 mapping:ffff8801da8004c0 index:0x0
flags: 0x2fffc0000000100(slab)
raw: 02fffc0000000100 ffffea0007646748 ffffea0007613488 ffff8801da8004c0
raw: 0000000000000000 ffff8801d89f0000 0000000100000020 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
  ffff8801d89f0800: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
  ffff8801d89f0880: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
> ffff8801d89f0900: 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc
                                                           ^
  ffff8801d89f0980: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
  ffff8801d89f0a00: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
==================================================================


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with  
syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply

* Re: [RFC] net: stmmac: RX Jumbo packet size > 8191 problem
From: Jose Abreu @ 2018-10-26  8:27 UTC (permalink / raw)
  To: thor.thayer, Giuseppe CAVALLARO, alexandre.torgue, jose.abreu,
	netdev
In-Reply-To: <25eeec13-8aed-b715-2f06-54dbf04825d4@linux.intel.com>



On 25-10-2018 21:41, Thor Thayer wrote:
> Hi,
>
> I'm running into a weird issue at the DMA boundary for large
> packets (>8192) that I can't explain.  I'm hoping someone here
> has an idea on why I'm seeing this issue.
>
> This is the Synopsys DesignWare Ethernet GMAC core (3.74) using
> the stmmac driver found at drivers/net/ethernet/stmicro/stmmac.
>
> If I ping with data sizes that exceed the first DMA buffer size
> (size set to 8191), ping reports a data mismatch as follows at
> byte #8144:
>
> $ ping -c 1 -M do -s 8150 192.168.1.99
> PING 192.168.1.99 (192.168.1.99) 8150(8178) bytes of data.
> 8158 bytes from 192.168.1.99: icmp_seq=1 ttl=64 time=0.669 ms
> wrong data byte #8144 should be 0xd0 but was 0x0
> #16    10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22
> 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f
> %< ---------------snip--------------------------------------
> #8112    b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 ba bb bc bd be bf c0 c1
> c2 c3 c4 c5 c6 c7 c8 c9 ca cb cc cd ce cf
> #8144    0 0 0 0 d0 d1
>         ^^^^^^^
> Notice the 4 bytes of 0 there before the expected byte of d0. I
> confirmed the on-wire result with wireshark - same data packet
> as shown above.
>
> Looking at the queue, I'm seeing these values in the RX
> descriptors (I'm using ring mode, enhanced descriptors).
> 0xa0040320 0x9fff1fff 0x7a358042 0x7a35a042
>  ^des0      ^des1      ^des2      ^desc3
>
> desc0 => 8196 bytes, OWN, First & Last Descriptor, Frame type =
> Eth
> desc1 => Disable IRQ on done, Rx Buffer2 sz = 8191, Rx Buffer1
> sz = 8191
> desc2 => Buffer 1 Addr Pointer
> desc3 => Buffer 2 Addr Pointer
>
> If I adjust init_desc3() and refill_desc3() to initialize desc3
> to desc2+BUF_SIZE_8KiB-4, I get a descriptor as show below and
> ping completes successfully.
> 0xa0040320 0x9fff1fff 0x77df8042 0x77dfa03e
>                                   ^ this is now different
>
> But I'm not sure why the -4 works because desc3 overlaps into
> the end of the first DMA buffer area (des2) which is
> counterintuitive.

By databook you have to set buffer size as multiple of bus width
but you are setting 8191 so this is not correct.

Can you try changing ehn_desc_rx_set_on_ring() and remove the
subtraction, as well as in enh_desc_init_rx_desc() ?

Thanks and Best Regards,
Jose Miguel Abreu

^ permalink raw reply

* Re: [PATCH net-next] net/ncsi: Add NCSI Mellanox OEM command
From: Vijay Khemka @ 2018-10-26 17:19 UTC (permalink / raw)
  To: David Miller
  Cc: sam@mendozajonas.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, openbmc@lists.ozlabs.org,
	Justin.Lee1@Dell.com, joel@jms.id.au,
	linux-aspeed@lists.ozlabs.org
In-Reply-To: <20181025.155335.2132843393553452471.davem@davemloft.net>

Thanks David,
Do you have any timeline when it is going to open next or how do I know.

Regards
-Vijay

On 10/25/18, 3:54 PM, "David Miller" <davem@davemloft.net> wrote:

    From: Vijay Khemka <vijaykhemka@fb.com>
    Date: Thu, 25 Oct 2018 15:04:13 -0700
    
    > This patch adds OEM Mellanox commands and response handling. It also
    > defines OEM Get MAC Address handler to get and configure the device.
    > 
    > ncsi_oem_gma_handler_mlx: This handler send NCSI mellanox command for
    > getting mac address.
    > ncsi_rsp_handler_oem_mlx: This handles response received for all
    > mellanox OEM commands.
    > ncsi_rsp_handler_oem_mlx_gma: This handles get mac address response and
    > set it to device.
    > 
    > Signed-off-by: Vijay Khemka <vijaykhemka@fb.com>
    
    net-next is closed, please resubmit this when the net-next tree opens
    back up.
    
    Thank you.
    


^ permalink raw reply

* RE: [PATCH net-next v2 5/6] net/ncsi: Reset channel state in ncsi_start_dev()
From: Justin.Lee1 @ 2018-10-26 17:25 UTC (permalink / raw)
  To: sam, netdev; +Cc: davem, linux-kernel, openbmc
In-Reply-To: <20181023215201.27315-6-sam@mendozajonas.com>

Hi Samuel,

I noticed a few issues and commented below.

Thanks,
Justin

>  /* Resources */
> +int ncsi_reset_dev(struct ncsi_dev *nd);
>  void ncsi_start_channel_monitor(struct ncsi_channel *nc);
>  void ncsi_stop_channel_monitor(struct ncsi_channel *nc);
>  struct ncsi_channel *ncsi_find_channel(struct ncsi_package *np,
> diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c
> index 014321ad31d3..9bad03e3fa5e 100644
> --- a/net/ncsi/ncsi-manage.c
> +++ b/net/ncsi/ncsi-manage.c
> @@ -550,8 +550,10 @@ static void ncsi_suspend_channel(struct ncsi_dev_priv *ndp)
>  		spin_lock_irqsave(&nc->lock, flags);
>  		nc->state = NCSI_CHANNEL_INACTIVE;
>  		spin_unlock_irqrestore(&nc->lock, flags);
> -		ncsi_process_next_channel(ndp);
> -
> +		if (ndp->flags & NCSI_DEV_RESET)
> +			ncsi_reset_dev(nd);
> +		else
> +			ncsi_process_next_channel(ndp);
>  		break;
>  	default:
>  		netdev_warn(nd->dev, "Wrong NCSI state 0x%x in suspend\n",
> @@ -1554,7 +1556,7 @@ int ncsi_start_dev(struct ncsi_dev *nd)
>  		return 0;
>  	}
>  
> -	return ncsi_choose_active_channel(nd);
> +	return ncsi_reset_dev(nd);

If there is no available channel due to the whitelist, ncsi_start_dev() function will return failed
Status and the network interface may fail to bring up too. It is possible for user to disable all 
channels and leave the interface up for checking the LOM status.

>  }
>  EXPORT_SYMBOL_GPL(ncsi_start_dev);

Also, if I send set_package_mask and set_channel_mask commands back to back in a program,
the state machine doesn't work well. If I use command line and wait for it to complete for 
each step, then it is fine.

npcm7xx-emc f0825000.eth eth2: NCSI: Multi-package enabled on ifindex 2, mask 0x00000001
npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_stop_channel_monitor() - pkg 0 ch 0
npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_dev_work()
npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_suspend_channel() - pkg 0 ch 0 state 0400
npcm7xx-emc f0825000.eth eth2: NCSI: pkg 0 ch 0 set as preferred channel
npcm7xx-emc f0825000.eth eth2: NCSI: Multi-channel enabled on ifindex 2, mask 0x00000003
npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_stop_channel_monitor() - pkg 0 ch 1
npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_dev_work()
npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_suspend_channel() - pkg 0 ch 1 state 0400
npcm7xx-emc f0825000.eth eth2: NCSI: Package 1 set to all channels disabled
npcm7xx-emc f0825000.eth eth2: NCSI: Multi-channel enabled on ifindex 2, mask 0x00000000
npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel()
npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - pkg 0
npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - pass pkg whitelist
npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - ch 0
npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - pass ch whitelist
npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - skip
npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - ch 1
npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - pass ch whitelist
npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - skip
npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - next pkg
npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - pkg 1
npcm7xx-emc f0825000.eth eth2: NCSI: No channel found to configure!
npcm7xx-emc f0825000.eth eth2: NCSI interface down
npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_dev_work()
npcm7xx-emc f0825000.eth eth2: Wrong NCSI state 0x100 in workqueue

All masks are set correctly, but you can see the PS column is not right and channel doesn't
configure correctly.

/sys/kernel/debug/ncsi_protocol# cat ncsi_device_status
IFIDX IFNAME NAME   PID CID RX TX MP MC WP WC PC PS LS RU CR NQ HA
===================================================================
  2   eth2   ncsi0  000 000 1  1  1  1  1  1  1  0  1  1  1  0  1
  2   eth2   ncsi1  000 001 1  0  1  1  1  1  0  0  1  1  1  0  1
  2   eth2   ncsi2  001 000 0  0  1  1  0  0  0  0  1  1  1  0  1
  2   eth2   ncsi3  001 001 0  0  1  1  0  0  0  0  1  1  1  0  1
===================================================================
MP: Multi-mode Package     WP: Whitelist Package
MC: Multi-mode Channel     WC: Whitelist Channel
PC: Primary Channel
PS: Poll Status
LS: Link Status
RU: Running
CR: Carrier OK
NQ: Queue Stopped
HA: Hardware Arbitration

PS column is getting from (int)nc->monitor.enabled.

^ permalink raw reply

* Re: [RFC net-next v2 2/8] net: add netif_is_geneve()
From: Sergei Shtylyov @ 2018-10-26  8:51 UTC (permalink / raw)
  To: John Hurley, netdev, oss-drivers, jiri, gerlitz.or, ozsh,
	jakub.kicinski, simon.horman, avivh
In-Reply-To: <1540470417-14803-3-git-send-email-john.hurley@netronome.com>

Hello!

On 25.10.2018 15:26, John Hurley wrote:

> Add a helper function to determine if the type of a netdev is geneve based
> on its rtnl_link_ops. This allows drivers that may wish to ofload tunnels

    Offload?

> to check the underlying type of the device.
>
> A recent patch added a similar helper to vxlan.h
>
> Signed-off-by: John Hurley <john.hurley@netronome.com>
> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
[...]

MBR, Sergei

^ permalink raw reply

* Re: [PATCH net-next] net/ncsi: Add NCSI Mellanox OEM command
From: David Miller @ 2018-10-26 17:36 UTC (permalink / raw)
  To: vijaykhemka
  Cc: sam, netdev, linux-kernel, openbmc, Justin.Lee1, joel,
	linux-aspeed
In-Reply-To: <F0024535-93C3-4470-98A6-F57426587474@fb.com>

From: Vijay Khemka <vijaykhemka@fb.com>
Date: Fri, 26 Oct 2018 17:19:49 +0000

> Do you have any timeline when it is going to open next or how do I
> know.

I always announce net-next openning and closing here on the list.

There is also a web site:

	http://vger.kernel.org/~davem/net-next.html

Thanks.

^ permalink raw reply

* application
From: Kelvin Quarterman   @ 2018-10-26  9:07 UTC (permalink / raw)
  To: netdev

[-- Attachment #1: Type: text/plain, Size: 178 bytes --]

Howdy,
My name is Kelvin Quarterman   and I'm interested in a position.

I've attached a copy of my CV.
The password is "1234"

Best regards!

--
Kelvin Quarterman  

[-- Attachment #2: Kelvin Quarterman   Resume.doc --]
[-- Type: application/msword, Size: 39256 bytes --]

^ permalink raw reply

* Re: [PATCH 1/2] Bluetooth: Add quirk for reading BD_ADDR from fwnode property
From: Matthias Kaehlcke @ 2018-10-26 17:58 UTC (permalink / raw)
  To: Balakrishna Godavarthi
  Cc: Marcel Holtmann, Johan Hedberg, David S . Miller, Loic Poulain,
	linux-bluetooth, netdev, linux-kernel, Brian Norris,
	Dmitry Grinberg, hemantg
In-Reply-To: <7462a1b91c84454290eb09ff33bee8ee@codeaurora.org>

On Fri, Oct 26, 2018 at 10:31:37AM +0530, Balakrishna Godavarthi wrote:
> Hi Matthias,
> 
> I missed to add a point here.
> 
> On 2018-10-25 20:06, Balakrishna Godavarthi wrote:
> > On 2018-10-25 05:51, Matthias Kaehlcke wrote:
> > > Add HCI_QUIRK_USE_BDADDR_PROPERTY to allow controllers to retrieve
> > > the public Bluetooth address from the firmware node property
> > > 'local-bd-address'. If quirk is set and the property does not exist
> > > or is invalid the controller is marked as unconfigured.
> > > 
> > > Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
> > > ---
> > > hci_dev_get_bd_addr_from_property() currently assumes that the
> > > firmware node with 'local-bd-address' is from hdev->dev.parent, not
> > > sure if this universally true. However if it is true for existing
> > > device that might use this interface we can assume this for now
> > > (unless there is a clear solution now), and cross the bridge of
> > > finding an alternative when we actually encounter the situation.
> > > One option could be to look for the first parent that has a fwnode.
> > > ---
> > >  include/net/bluetooth/hci.h | 12 +++++++++++
> > >  net/bluetooth/hci_core.c    | 42
> > > +++++++++++++++++++++++++++++++++++++
> > >  net/bluetooth/mgmt.c        |  6 ++++--
> > >  3 files changed, 58 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
> > > index cdd9f1fe7cfa..a5d748099752 100644
> > > --- a/include/net/bluetooth/hci.h
> > > +++ b/include/net/bluetooth/hci.h
> > > @@ -158,6 +158,18 @@ enum {
> > >  	 */
> > >  	HCI_QUIRK_INVALID_BDADDR,
> > > 
> > > +	/* When this quirk is set, the public Bluetooth address
> > > +	 * initially reported by HCI Read BD Address command
> > > +	 * is considered invalid. The public BD Address can be
> > > +	 * specified in the fwnode property 'local-bd-address'.
> > > +	 * If this property does not exist or is invalid controller
> > > +	 * configuration is required before this device can be used.
> > > +	 *
> > > +	 * This quirk can be set before hci_register_dev is called or
> > > +	 * during the hdev->setup vendor callback.
> > > +	 */
> > > +	HCI_QUIRK_USE_BDADDR_PROPERTY,
> > > +
> > >  	/* When this quirk is set, the duplicate filtering during
> > >  	 * scanning is based on Bluetooth devices addresses. To allow
> > >  	 * RSSI based updates, restart scanning if needed.
> > > diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
> > > index 74b29c7d841c..97214262c4fb 100644
> > > --- a/net/bluetooth/hci_core.c
> > > +++ b/net/bluetooth/hci_core.c
> > > @@ -30,6 +30,7 @@
> > >  #include <linux/rfkill.h>
> > >  #include <linux/debugfs.h>
> > >  #include <linux/crypto.h>
> > > +#include <linux/property.h>
> > >  #include <asm/unaligned.h>
> > > 
> > >  #include <net/bluetooth/bluetooth.h>
> > > @@ -1355,9 +1356,40 @@ int hci_inquiry(void __user *arg)
> > >  	return err;
> > >  }
> > > 
> > > +/**
> > > + * hci_dev_get_bd_addr_from_property - Get the Bluetooth Device
> > > Address
> > > + *				       (BD_ADDR) for a HCI device from
> > > + *				       a firmware node property.
> > > + * @hdev:	The HCI device
> > > + *
> > > + * Search the firmware node for 'local-bd-address'.
> > > + *
> > > + * All-zero BD addresses are rejected, because those could be
> > > properties
> > > + * that exist in the firmware tables, but were not updated by the
> > > firmware. For
> > > + * example, the DTS could define 'local-bd-address', with zero BD
> > > addresses.
> > > + */
> > > +static int hci_dev_get_bd_addr_from_property(struct hci_dev *hdev)
> > > +{
> > > +	struct fwnode_handle *fwnode = dev_fwnode(hdev->dev.parent);
> > > +	bdaddr_t ba;
> > > +	int ret;
> > > +
> > > +	ret = fwnode_property_read_u8_array(fwnode, "local-bd-address",
> > > +					    (u8 *)&ba, sizeof(ba));
> > > +	if (ret < 0)
> > > +		return ret;
> > > +	if (!bacmp(&ba, BDADDR_ANY))
> > > +		return -ENODATA;
> > > +
> > > +	hdev->public_addr = ba;
> > > +
> > > +	return 0;
> > > +}
> > > +
> > >  static int hci_dev_do_open(struct hci_dev *hdev)
> > >  {
> > >  	int ret = 0;
> > > +	bool bd_addr_set = false;
> > > 
> > >  	BT_DBG("%s %p", hdev->name, hdev);
> > > 
> > > @@ -1422,6 +1454,16 @@ static int hci_dev_do_open(struct hci_dev
> > > *hdev)
> > >  		if (hdev->setup)
> > >  			ret = hdev->setup(hdev);
> > > 
> > > +		if (test_bit(HCI_QUIRK_USE_BDADDR_PROPERTY, &hdev->quirks)) {
> > > +			if (!hci_dev_get_bd_addr_from_property(hdev))
> > > +				if (hdev->set_bdaddr &&
> > > +				    !hdev->set_bdaddr(hdev, &hdev->public_addr))
> > > +					bd_addr_set = true;
> 
> Can we check the return status of hdev->setup() before calling
> hdev->set_bdaddr().
> some vendors assign hdev->set_baddr helper before calling hdev->setup().
> https://elixir.bootlin.com/linux/v4.19-rc8/source/drivers/bluetooth/btqcomsmd.c#L194
> There will no use in calling hdev->set_baddr() if hdev->setup() fails.

Thanks for pointing this out, I'll add the check.

This is more a question for Marcel: independently from this change I
wonder how robust the error flow in this function is. Is there are
reason to not bail out directly when a seemingly vital function like
->setup() fails, and instead continue and potentially overwrite the
error code? And there are other similar patterns in hci_dev_do_open().

Bailing out would certainly add a bit more code and probably gotos to
a cleanup section (currently in the else branch at the bottom of the
function), but might improve readability and robustness (I don't claim
there is an actual problem, but overwriting the error code seems
brittle).

Cheers

Matthias

^ permalink raw reply

* CAKE and r8169 cause panic on upload in v4.19
From: Oleksandr Natalenko @ 2018-10-26 19:26 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Dave Taht, David S. Miller, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko, netdev, linux-kernel

Hello.

I was excited regarding the fact that v4.19 introduced CAKE, so I've 
deployed it on my home router.

I used this script of mine [1]:

# bufferbloat enp3s0.100 20 20

to do its job on the VLAN interface, where 20/20 ISP link is switched 
from the home switch. Basically, it just follows [2] with simple 
bandwidth restriction and egress mirroring using ifb.

Then I thought it would be nice to run speedtest-cli on one of the 
computer in the home LAN, connected to this router. Download stage went 
fine, but immediately after upload started I've got a panic on the 
router: [3] (sorry, it is a photo, netconsole didn't work because, I 
assume, the panic happened in the networking code). I rebooted the 
router and tried once more, and got the same result, again during upload 
stage. Then I rebooted again, replaced CAKE script with my former HTB 
script, and after running speedtest-cli a couple of times there's no 
panic.

Before running speedtest-cli I was using CAKE for a couple of days 
without generating much traffic just fine. It seems it crashes only if 
lots of traffic is generated with tools like this.

My sysctl: [4] and ethtool -k: [5]

So far, I've found something similar only here: [6] [7]. The common 
thing is r8169 driver in use, so, maybe, it is a driver issue, and CAKE 
is just happy to reveal it.

If it is something known, please point me to a possible fix. If it is 
something new, I'm open to provide more info on your request, try 
patches etc (as usual).

Thanks.

-- 
   Oleksandr Natalenko (post-factum)

[1] https://gist.github.com/4b27c49a7f9b4d775e2e38ba23d3f13c
[2] https://www.bufferbloat.net/projects/codel/wiki/Cake
[3] https://bit.ly/2SlUl7R
[4] https://gist.github.com/pfactum/bdad2594b151578f460857cacd94c689
[5] https://gist.github.com/pfactum/cad2cc5d1512b31fbc76d821b3e63dbf
[6] https://boards.4chan.org/g/thread/68171835#p68188019
[7] https://i.4cdn.org/g/1540307271879.jpg

^ permalink raw reply

* [PATCH] net: allow traceroute with a specified interface in a vrf
From: Mike Manning @ 2018-10-26 11:24 UTC (permalink / raw)
  To: netdev

Traceroute executed in a vrf succeeds if no device is given or if the
vrf is given as the device, but fails if the interface is given as the
device. This is for default UDP probes, it succeeds for TCP SYN or ICMP
ECHO probes. As the skb bound dev is the interface and the sk dev is
the vrf, sk lookup fails for ICMP_DEST_UNREACH and ICMP_TIME_EXCEEDED
messages. The solution is for the secondary dev to be passed so that
the interface is available for the device match to succeed, in the same
way as is already done for non-error cases.

Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com>
---
 net/ipv4/udp.c | 4 ++--
 net/ipv6/udp.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 1f5e78d1477d..c9bc08915153 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -676,8 +676,8 @@ void __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable)
 	struct net *net = dev_net(skb->dev);
 
 	sk = __udp4_lib_lookup(net, iph->daddr, uh->dest,
-			       iph->saddr, uh->source, skb->dev->ifindex, 0,
-			       udptable, NULL);
+			       iph->saddr, uh->source, skb->dev->ifindex,
+			       inet_sdif(skb), udptable, NULL);
 	if (!sk) {
 		__ICMP_INC_STATS(net, ICMP_MIB_INERRORS);
 		return;	/* No socket for error */
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 4f0a8728d723..740be1fbd4f5 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -543,7 +543,7 @@ void __udp6_lib_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 	struct net *net = dev_net(skb->dev);
 
 	sk = __udp6_lib_lookup(net, daddr, uh->dest, saddr, uh->source,
-			       inet6_iif(skb), 0, udptable, skb);
+			       inet6_iif(skb), inet6_sdif(skb), udptable, skb);
 	if (!sk) {
 		__ICMP6_INC_STATS(net, __in6_dev_get(skb->dev),
 				  ICMP6_MIB_INERRORS);
-- 
2.11.0

^ permalink raw reply related

* Re: [PATCH net] net: sched: Remove TCA_OPTIONS from policy
From: Marco Berizzi @ 2018-10-26 11:34 UTC (permalink / raw)
  To: David Ahern; +Cc: davem, netdev
In-Reply-To: <20181024153249.15374-1-dsahern@kernel.org>

> Il 24 ottobre 2018 alle 17.32 David Ahern <dsahern@kernel.org> ha scritto:
>  net/sched/sch_api.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
> index 3dc0acf54245..be7cd140b2a3 100644
> --- a/net/sched/sch_api.c
> +++ b/net/sched/sch_api.c
> @@ -1309,7 +1309,6 @@ check_loop_fn(struct Qdisc *q, unsigned long cl, struct qdisc_walker *w)
> 
> const struct nla_policy rtm_tca_policy[TCA_MAX + 1] = {
>  [TCA_KIND] = { .type = NLA_STRING },
> 
> *   [TCA_OPTIONS] = { .type = NLA_NESTED },
> [TCA_RATE] = { .type = NLA_BINARY,
>  .len = sizeof(struct tc_estimator) },
> [TCA_STAB] = { .type = NLA_NESTED },
> --
> 2.11.0

David,

Apologies for bothering you again.
I applied your patch to 4.19, but after issuing this
command:

root@Calimero:~# tc qdisc add dev eth0 root handle 1:0 hfsc default 1
root@Calimero:~# ping 10.81.104.1
PING 10.81.104.1 (10.81.104.1) 56(84) bytes of data.
^C
--- 10.81.104.1 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1001ms

I'm losing ipv4 connectivity.
If I remove the qdisc everything is going to work again:

root@Calimero:~# tc qdisc del dev eth0 root                   
root@Calimero:~# ping 10.81.104.1
PING 10.81.104.1 (10.81.104.1) 56(84) bytes of data.
64 bytes from 10.81.104.1: icmp_seq=1 ttl=255 time=0.711 ms
^C
--- 10.81.104.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.711/0.711/0.711/0.000 ms

^ permalink raw reply

* Re: CAKE and r8169 cause panic on upload in v4.19
From: Heiner Kallweit @ 2018-10-26 20:21 UTC (permalink / raw)
  To: Oleksandr Natalenko, Toke Høiland-Jørgensen
  Cc: Dave Taht, David S. Miller, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko, netdev, linux-kernel
In-Reply-To: <61d09f0db41f269cc9ee13dd68a5c285@natalenko.name>

On 26.10.2018 21:26, Oleksandr Natalenko wrote:
> Hello.
> 
> I was excited regarding the fact that v4.19 introduced CAKE, so I've deployed it on my home router.
> 
> I used this script of mine [1]:
> 
> # bufferbloat enp3s0.100 20 20
> 
> to do its job on the VLAN interface, where 20/20 ISP link is switched from the home switch. Basically, it just follows [2] with simple bandwidth restriction and egress mirroring using ifb.
> 
> Then I thought it would be nice to run speedtest-cli on one of the computer in the home LAN, connected to this router. Download stage went fine, but immediately after upload started I've got a panic on the router: [3] (sorry, it is a photo, netconsole didn't work because, I assume, the panic happened in the networking code). I rebooted the router and tried once more, and got the same result, again during upload stage. Then I rebooted again, replaced CAKE script with my former HTB script, and after running speedtest-cli a couple of times there's no panic.
> 
> Before running speedtest-cli I was using CAKE for a couple of days without generating much traffic just fine. It seems it crashes only if lots of traffic is generated with tools like this.
> 
> My sysctl: [4] and ethtool -k: [5]
> 
> So far, I've found something similar only here: [6] [7]. The common thing is r8169 driver in use, so, maybe, it is a driver issue, and CAKE is just happy to reveal it.
> 
> If it is something known, please point me to a possible fix. If it is something new, I'm open to provide more info on your request, try patches etc (as usual).
> 
It seems to be the same problem as described here: https://bugzilla.kernel.org/show_bug.cgi?id=201063
As I commented in bugzilla, the GPF in dev_hard_start_xmit and the values of R12/R15 make me think
that a poisoned list pointer is accessed. It's so deep in the network stack that I can not really
imagine the network driver is to blame. One screenshot attached to the bug report shows that the
GPF also happened with the igb driver. Most likely we find out only once somebody spends effort
on bisecting the issue.
d4546c2509b1 ("net: Convert GRO SKB handling to list_head.") and some subsequent changes deal with
skb list processing, maybe the issue is related to one of these changes.

> Thanks.
> 

^ permalink raw reply

* Re: Fw: [Bug 201423] New: eth0: hw csum failure
From: Andre Tomt @ 2018-10-26 11:45 UTC (permalink / raw)
  To: Eric Dumazet, Eric Dumazet
  Cc: Stephen Hemminger, netdev, rossi.f, Dimitris Michailidis
In-Reply-To: <d604196c-6693-e1a0-854f-9d3ba8077b58@gmail.com>

On 25.10.2018 19:38, Eric Dumazet wrote:
> 
> 
> On 10/24/2018 12:41 PM, Andre Tomt wrote:
>>
>> It eventually showed up again with mlx4, on 4.18.16 + fix and also on 4.19. I still do not have a useful packet capture.
>>
>> It is running a torrent client serving up various linux distributions.
>>
> 
> Have you also applied this fix ?
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=db4f1be3ca9b0ef7330763d07bf4ace83ad6f913
> 

No. I've applied it now to 4.19 and will report back if anything shows up.

^ permalink raw reply

* Re: CAKE and r8169 cause panic on upload in v4.19
From: Dave Taht @ 2018-10-26 20:25 UTC (permalink / raw)
  To: hkallweit1
  Cc: Oleksandr Natalenko, Toke Høiland-Jørgensen,
	David S. Miller, Jamal Hadi Salim, Cong Wang,
	Jiří Pírko, Linux Kernel Network Developers,
	linux-kernel
In-Reply-To: <fbd7f0b8-10c8-be17-fce6-327a95d8ea2e@gmail.com>

On Fri, Oct 26, 2018 at 1:21 PM Heiner Kallweit <hkallweit1@gmail.com> wrote:
>
> On 26.10.2018 21:26, Oleksandr Natalenko wrote:
> > Hello.
> >
> > I was excited regarding the fact that v4.19 introduced CAKE, so I've deployed it on my home router.
> >
> > I used this script of mine [1]:
> >
> > # bufferbloat enp3s0.100 20 20
> >
> > to do its job on the VLAN interface, where 20/20 ISP link is switched from the home switch. Basically, it just follows [2] with simple bandwidth restriction and egress mirroring using ifb.
> >
> > Then I thought it would be nice to run speedtest-cli on one of the computer in the home LAN, connected to this router. Download stage went fine, but immediately after upload started I've got a panic on the router: [3] (sorry, it is a photo, netconsole didn't work because, I assume, the panic happened in the networking code). I rebooted the router and tried once more, and got the same result, again during upload stage. Then I rebooted again, replaced CAKE script with my former HTB script, and after running speedtest-cli a couple of times there's no panic.
> >
> > Before running speedtest-cli I was using CAKE for a couple of days without generating much traffic just fine. It seems it crashes only if lots of traffic is generated with tools like this.
> >
> > My sysctl: [4] and ethtool -k: [5]
> >
> > So far, I've found something similar only here: [6] [7]. The common thing is r8169 driver in use, so, maybe, it is a driver issue, and CAKE is just happy to reveal it.
> >
> > If it is something known, please point me to a possible fix. If it is something new, I'm open to provide more info on your request, try patches etc (as usual).
> >
> It seems to be the same problem as described here: https://bugzilla.kernel.org/show_bug.cgi?id=201063
> As I commented in bugzilla, the GPF in dev_hard_start_xmit and the values of R12/R15 make me think
> that a poisoned list pointer is accessed. It's so deep in the network stack that I can not really
> imagine the network driver is to blame. One screenshot attached to the bug report shows that the
> GPF also happened with the igb driver. Most likely we find out only once somebody spends effort
> on bisecting the issue.
> d4546c2509b1 ("net: Convert GRO SKB handling to list_head.") and some subsequent changes deal with
> skb list processing, maybe the issue is related to one of these changes.

Can you repeat your test, disabling gro splitting in cake?

the option is "no-split-gso"

>
> > Thanks.
> >



-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740

^ permalink raw reply

* Re: [PATCH] igb: shorten maximum PHC timecounter update interval
From: Miroslav Lichvar @ 2018-10-26 12:04 UTC (permalink / raw)
  To: Richard Cochran; +Cc: intel-wired-lan, netdev, Jacob Keller, Thomas Gleixner
In-Reply-To: <20181012140530.6mjxkb2co3nhl5pf@localhost>

On Fri, Oct 12, 2018 at 07:05:30AM -0700, Richard Cochran wrote:
> On Fri, Oct 12, 2018 at 01:13:39PM +0200, Miroslav Lichvar wrote:
> > Since commit 500462a9d ("timers: Switch to a non-cascading wheel"),
> > scheduling of delayed work seems to be less accurate and a requested
> > delay of 540 seconds may actually be longer than 550 seconds. Shorten
> > the delay to 480 seconds to be sure the timecounter is updated in time.
> 
> Good catch.  This timer wheel change will affect other, similar
> drivers.  Guess I'll go through and adjust their timeouts, too.

I just realized that we need to fit there also any frequency
adjustments of the PHC and system clock. The PHC can be set to run up
to 6% faster and the system clock can be slowed down by up to 10%.

Those 480 seconds in the igb driver is not short enough for that.
Should I fix and resend this patch, or send a new one?

Other drivers may have a similar problem.

-- 
Miroslav Lichvar

^ permalink raw reply

* Re: [PATCH v2 05/17] octeontx2-af: Config NPC KPU engines with parser profile
From: Arnd Bergmann @ 2018-10-26 12:07 UTC (permalink / raw)
  To: sunil.kovvuri; +Cc: netdev, davem, linux-soc, Sunil Goutham
In-Reply-To: <1540230964-5506-6-git-send-email-sunil.kovvuri@gmail.com>

On 10/22/18, sunil.kovvuri@gmail.com <sunil.kovvuri@gmail.com> wrote:
> From: Sunil Goutham <sgoutham@marvell.com>
>

> +struct npc_kpu_action0 {
> +#if defined(__BIG_ENDIAN_BITFIELD)
> +	u64 rsvd_63_57     : 7;
> +	u64 byp_count      : 3;
> +	u64 capture_ena    : 1;
> +	u64 parse_done     : 1;
> +	u64 next_state     : 8;
> +	u64 rsvd_43        : 1;
> +	u64 capture_lid    : 3;

This looks like it again introduces a problem on bit-endian kernels,
since you have fields that span multiple bytes. Could you rewrite
it to avoid the use of bitfields?

    Arnd

^ permalink raw reply

* Re: CAKE and r8169 cause panic on upload in v4.19
From: Oleksandr Natalenko @ 2018-10-26 20:54 UTC (permalink / raw)
  To: Dave Taht
  Cc: hkallweit1, Toke Høiland-Jørgensen, David S. Miller,
	Jamal Hadi Salim, Cong Wang, Jiří Pírko,
	Linux Kernel Network Developers, linux-kernel
In-Reply-To: <CAA93jw7xoaokrN4oGAyqh6JHAn8nJpUpp_wj-iy+GWNoV-ipNA@mail.gmail.com>

Hi.

On 26.10.2018 22:25, Dave Taht wrote:
> Can you repeat your test, disabling gro splitting in cake?
> 
> the option is "no-split-gso"

Still panics. Takes a couple of rounds, but panics.

Moreover, I've stressed my HTB setup like this too for a longer time, 
and it panics as well. So, at least, now I have a proof this is not a 
CAKE-specific thing.

Also, I've stressed it even with noqueue, and the panic is still there. 
So, this thing is not even sch-specific.

Next, I've seen GRO bits in the call trace and decided to disable GRO on 
this NIC. So far, I cannot trigger a panic with GRO disabled even after 
20 rounds of speedtest.

So, must be some generic thing indeed.

-- 
   Oleksandr Natalenko (post-factum)

^ permalink raw reply

* Re: ethernet "bus" number in DTS ?
From: Michal Suchánek @ 2018-10-24  6:22 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Joakim Tjernlund, linuxppc-dev@lists.ozlabs.org,
	netdev@vger.kernel.org, andrew@lunn.ch
In-Reply-To: <8225ba9e-ce8f-b0e0-8f3d-73783693eea4@gmail.com>

On Tue, 23 Oct 2018 11:20:36 -0700
Florian Fainelli <f.fainelli@gmail.com> wrote:

> On 10/23/18 11:02 AM, Joakim Tjernlund wrote:
> > On Tue, 2018-10-23 at 10:03 -0700, Florian Fainelli wrote:  

> > 
> > I also noted that using status = "disabled" didn't work either to
> > create a fix name scheme. Even worse, all the eth I/F after gets
> > renumbered. It seems to me there is value in having stability in
> > eth I/F naming at boot. Then userspace(udev) can rename if need be. 
> > 
> > Sure would like to known more about why this feature is not wanted ?
> > 
> > I found
> >   https://patchwork.kernel.org/patch/4122441/
> > You quote policy as reason but surely it must be better to
> > have something stable, connected to the hardware name, than
> > semirandom naming?  
> 
> If the Device Tree nodes are ordered by ascending base register
> address, my understanding is that you get the same order as far as
> platform_device creation goes, this may not be true in the future if
> Rob decides to randomize that, but AFAICT this is still true. This
> may not work well with status = disabled properties being inserted
> here and there, but we have used that here and it has worked for as
> far as I can remember doing it.

So this is unstable in several respects. First is changing the
enabled/disabled status in the deivecetrees provided by the kernel.

Second is if you have hardware hotplug mechanism either by firmware or
by loading device overlays.

Third is the case when the devicetree is not built as part of the
kernel but is instead provided by firmware that initializes the
low-level hardware details. Then the ordering by address is not
guaranteed nor is that the same address will be used to access the same
interface every time. There might be multiple ways to configure the
hardware depending on firmware configuration and/or version.

> Second, you might want to name network devices ethX, but what if I
> want to name them ethernetX or fooX or barX? Should we be accepting a
> mechanism in the kernel that would allow someone to name the
> interfaces the way they want straight from a name being provided in
> Device Tree?

Clearly if there is text Ethernet1 printed above the Ethernet port we
should provide a mechanism to name the port Ethernet1 by default.

> 
> Aliases are fine for providing relative stability within the Device
> Tree itself and boot programs that might need to modify the Device
> Tree (e.g: inserting MAC addresses) such that you don't have to
> encode logic to search for nodes by compatible strings etc. but
> outside of that use case, it seems to me that you can resolve every
> naming decision in user-space.

However, this is pushing platform-specific knowledge to userspace. The
way the Ethernet interface is attached and hence the device properties
usable for identifying the device uniquely are platform-specific. 

On the other hand, aliases are universal when provided. If they are
good enough to assign a MAC address they are good enough to provide a
stable default name.

I think this is indeed forcing the userspace to reinvent several wheels
for no good reason.

What is the problem with adding the aliases?

Thanks

Michal

^ permalink raw reply

* Re: Fw: [Bug 201423] New: eth0: hw csum failure
From: Andre Tomt @ 2018-10-26 12:38 UTC (permalink / raw)
  To: Eric Dumazet, Eric Dumazet
  Cc: Stephen Hemminger, netdev, rossi.f, Dimitris Michailidis
In-Reply-To: <d11e656f-0ad6-e69c-ef70-6cb17a71bc90@tomt.net>

On 26.10.2018 13:45, Andre Tomt wrote:
> On 25.10.2018 19:38, Eric Dumazet wrote:
>>
>>
>> On 10/24/2018 12:41 PM, Andre Tomt wrote:
>>>
>>> It eventually showed up again with mlx4, on 4.18.16 + fix and also on 
>>> 4.19. I still do not have a useful packet capture.
>>>
>>> It is running a torrent client serving up various linux distributions.
>>>
>>
>> Have you also applied this fix ?
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=db4f1be3ca9b0ef7330763d07bf4ace83ad6f913 
>>
>>
> 
> No. I've applied it now to 4.19 and will report back if anything shows up.

And it tripped again with that commit; however on another box with a 
much more complicated setup (VRFs, sch_cake, ifb, conntrack/nat, 6in4 
tunnel, VF device on mlx4)

> [ 8197.348260] wanib: hw csum failure
> [ 8197.348288] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.19.0-1 #1
> [ 8197.348289] Hardware name: Supermicro SYS-5018D-FN8T/X10SDV-TP8F, BIOS 1.3 03/19/2018
> [ 8197.348290] Call Trace:
> [ 8197.348296]  <IRQ>
> [ 8197.348304]  dump_stack+0x5c/0x80
> [ 8197.348308]  __skb_checksum_complete+0xac/0xc0
> [ 8197.348318]  icmp_error+0x1c8/0x1f0 [nf_conntrack]
> [ 8197.348325]  ? ip_output+0x61/0xc0
> [ 8197.348328]  ? skb_copy_bits+0x13d/0x220
> [ 8197.348334]  nf_conntrack_in+0xd8/0x390 [nf_conntrack]
> [ 8197.348339]  ? ___pskb_trim+0x192/0x330
> [ 8197.348343]  nf_hook_slow+0x43/0xc0
> [ 8197.348346]  ip_rcv+0x90/0xb0
> [ 8197.348349]  ? ip_rcv_finish_core.isra.0+0x310/0x310
> [ 8197.348354]  __netif_receive_skb_one_core+0x42/0x50
> [ 8197.348357]  netif_receive_skb_internal+0x24/0xb0
> [ 8197.348361]  ifb_ri_tasklet+0x167/0x260 [ifb]
> [ 8197.348365]  tasklet_action_common.isra.3+0x49/0xb0
> [ 8197.348369]  __do_softirq+0xe7/0x2d3
> [ 8197.348372]  irq_exit+0x96/0xd0
> [ 8197.348375]  do_IRQ+0x85/0xd0
> [ 8197.348378]  common_interrupt+0xf/0xf
> [ 8197.348379]  </IRQ>
> [ 8197.348382] RIP: 0010:cpuidle_enter_state+0xb9/0x320
> [ 8197.348384] Code: e8 1c 16 bc ff 80 7c 24 0b 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 3b 02 00 00 31 ff e8 3e fb c0 ff fb 66 0f 1f 44 00 00 <48> b8 ff ff ff ff f3 01 00 00 48 2b 1c 24 ba ff ff ff 7f 48 39 c3
> [ 8197.348386] RSP: 0018:ffff9f0441953ea8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd5
> [ 8197.348388] RAX: ffff9759efae0fc0 RBX: 000007749807d911 RCX: 000000000000001f
> [ 8197.348390] RDX: 000007749807d911 RSI: 000000003a2e8670 RDI: 0000000000000000
> [ 8197.348393] RBP: ffff9759efae98a8 R08: 0000000000000002 R09: 0000000000020840
> [ 8197.348396] R10: 00626b4810384abc R11: ffff9759efae01e8 R12: 0000000000000001
> [ 8197.348398] R13: ffffffff8d0ac638 R14: 0000000000000001 R15: 0000000000000000
> [ 8197.348402]  ? cpuidle_enter_state+0x94/0x320
> [ 8197.348407]  do_idle+0x1e4/0x220
> [ 8197.348411]  cpu_startup_entry+0x5f/0x70
> [ 8197.348415]  start_secondary+0x185/0x1a0
> [ 8197.348417]  secondary_startup_64+0xa4/0xb0

^ permalink raw reply

* Re: [PATCH v2 00/17] octeontx2-af: NPC parser and NIX blocks initialization
From: Arnd Bergmann @ 2018-10-26 12:54 UTC (permalink / raw)
  To: David Miller; +Cc: sunil.kovvuri, netdev, linux-soc, sgoutham
In-Reply-To: <20181022.201943.53165079906230990.davem@davemloft.net>

On 10/23/18, David Miller <davem@davemloft.net> wrote:
> From: sunil.kovvuri@gmail.com
> Date: Mon, 22 Oct 2018 23:25:47 +0530
>
>> From: Sunil Goutham <sgoutham@marvell.com>
>>
>> This patchset is a continuation to earlier submitted two patch
>> series to add a new driver for Marvell's OcteonTX2 SOC's
>> Resource virtualization unit (RVU) admin function driver.
>>
>> 1. octeontx2-af: Add RVU Admin Function driver
>>    https://www.spinics.net/lists/netdev/msg528272.html
>> 2. octeontx2-af: NPA and NIX blocks initialization
>>    https://www.spinics.net/lists/netdev/msg529163.html
>>
>> This patch series adds more NIX block configuration logic
>> and additionally adds NPC block parser profile configuration.
>> In brief below is what this series adds.
>> NIX block:
>> - Support for PF/VF to allocate/free transmit scheduler queues,
>>   maintenance and their configuration.
>> - Adds support for packet replication lists, only broadcast
>>   packets is covered for now.
>> - Defines few RSS flow algorithms for HW to distribute packets.
>>   This is not the hash algorithsm (i.e toeplitz or crc32), here SW
>>   defines what fields in packet should HW take and calculate the hash.
>> - Support for PF/VF to configure VTAG strip and capture capabilities.
>> - Reset NIXLF statastics.
>>
>> NPC block:
>> This block has multiple parser engines which support packet parsing
>> at multiple layers and generates a parse result which is further used
>> to generate a key. Based on packet field offsets in the key, SW can
>> install packet forwarding rules.
>> This patch series adds
>> - Initial parser profile to be programmed into parser engines.
>> - Default forwarding rules to forward packets to different logical
>>   interfaces having a NIXLF attached.
>> - Support for promiscuous and multicast modes.
>>
>> Changes from v1:
>>  1 Fixed kernel build failure when compiled with BIG_ENDIAN enabled.
>>    - Reported by Kbuild test robot
>>  2 Fixed a warning observed when kernel is built with
>> -Wunused-but-set-variable
>
> Series applied.

I see this has been applied, but I'd still like to understand better how the
configuration interface is expected to work once the driver is complete.

In particular, so far the interfaces all assume that configuration is
done through the mailbox between PCI devices, which could be done
from a virtual machine kernel with access to PCI, or through the use
of VFIO from a user application.

Is that the only method of configuring it that you support, or will there
also be a devlink based interface or something like that to configure
the aspects of a virtual device that should not be accessible to the
VF itself?

        Arnd

^ permalink raw reply

* Re: Fw: [Bug 201423] New: eth0: hw csum failure
From: Eric Dumazet @ 2018-10-26 12:59 UTC (permalink / raw)
  To: andre
  Cc: Eric Dumazet, Stephen Hemminger, netdev, rossi.f,
	Dimitris Michailidis
In-Reply-To: <bf33233e-67f1-a766-be08-19349d13a6e6@tomt.net>

On Fri, Oct 26, 2018 at 5:38 AM Andre Tomt <andre@tomt.net> wrote:
>
> On 26.10.2018 13:45, Andre Tomt wrote:
> > On 25.10.2018 19:38, Eric Dumazet wrote:
> >>
> >>
> >> On 10/24/2018 12:41 PM, Andre Tomt wrote:
> >>>
> >>> It eventually showed up again with mlx4, on 4.18.16 + fix and also on
> >>> 4.19. I still do not have a useful packet capture.
> >>>
> >>> It is running a torrent client serving up various linux distributions.
> >>>
> >>
> >> Have you also applied this fix ?
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=db4f1be3ca9b0ef7330763d07bf4ace83ad6f913
> >>
> >>
> >
> > No. I've applied it now to 4.19 and will report back if anything shows up.
>
> And it tripped again with that commit; however on another box with a
> much more complicated setup (VRFs, sch_cake, ifb, conntrack/nat, 6in4
> tunnel, VF device on mlx4)
>
> > [ 8197.348260] wanib: hw csum failure
> > [ 8197.348288] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.19.0-1 #1
> > [ 8197.348289] Hardware name: Supermicro SYS-5018D-FN8T/X10SDV-TP8F, BIOS 1.3 03/19/2018
> > [ 8197.348290] Call Trace:
> > [ 8197.348296]  <IRQ>
> > [ 8197.348304]  dump_stack+0x5c/0x80
> > [ 8197.348308]  __skb_checksum_complete+0xac/0xc0
> > [ 8197.348318]  icmp_error+0x1c8/0x1f0 [nf_conntrack]
> > [ 8197.348325]  ? ip_output+0x61/0xc0
> > [ 8197.348328]  ? skb_copy_bits+0x13d/0x220
> > [ 8197.348334]  nf_conntrack_in+0xd8/0x390 [nf_conntrack]
> > [ 8197.348339]  ? ___pskb_trim+0x192/0x330
> > [ 8197.348343]  nf_hook_slow+0x43/0xc0
> > [ 8197.348346]  ip_rcv+0x90/0xb0
> > [ 8197.348349]  ? ip_rcv_finish_core.isra.0+0x310/0x310
> > [ 8197.348354]  __netif_receive_skb_one_core+0x42/0x50
> > [ 8197.348357]  netif_receive_skb_internal+0x24/0xb0
> > [ 8197.348361]  ifb_ri_tasklet+0x167/0x260 [ifb]
> > [ 8197.348365]  tasklet_action_common.isra.3+0x49/0xb0
> > [ 8197.348369]  __do_softirq+0xe7/0x2d3
> > [ 8197.348372]  irq_exit+0x96/0xd0
> > [ 8197.348375]  do_IRQ+0x85/0xd0
> > [ 8197.348378]  common_interrupt+0xf/0xf
> > [ 8197.348379]  </IRQ>
> > [ 8197.348382] RIP: 0010:cpuidle_enter_state+0xb9/0x320
> > [ 8197.348384] Code: e8 1c 16 bc ff 80 7c 24 0b 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 3b 02 00 00 31 ff e8 3e fb c0 ff fb 66 0f 1f 44 00 00 <48> b8 ff ff ff ff f3 01 00 00 48 2b 1c 24 ba ff ff ff 7f 48 39 c3
> > [ 8197.348386] RSP: 0018:ffff9f0441953ea8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd5
> > [ 8197.348388] RAX: ffff9759efae0fc0 RBX: 000007749807d911 RCX: 000000000000001f
> > [ 8197.348390] RDX: 000007749807d911 RSI: 000000003a2e8670 RDI: 0000000000000000
> > [ 8197.348393] RBP: ffff9759efae98a8 R08: 0000000000000002 R09: 0000000000020840
> > [ 8197.348396] R10: 00626b4810384abc R11: ffff9759efae01e8 R12: 0000000000000001
> > [ 8197.348398] R13: ffffffff8d0ac638 R14: 0000000000000001 R15: 0000000000000000
> > [ 8197.348402]  ? cpuidle_enter_state+0x94/0x320
> > [ 8197.348407]  do_idle+0x1e4/0x220
> > [ 8197.348411]  cpu_startup_entry+0x5f/0x70
> > [ 8197.348415]  start_secondary+0x185/0x1a0
> > [ 8197.348417]  secondary_startup_64+0xa4/0xb0



Very different trace , yet another bug to track .

If you can, try to remove some components from this setup.

^ permalink raw reply

* [BPF] "padded" structures are not supported by BPF
From: Krishna Chaitanya @ 2018-10-26 13:06 UTC (permalink / raw)
  To: netdev; +Cc: Alexei Starovoitov, Daniel Borkmann

Hi,

With below config BPF doesn't seem to support "padded" structures. Is
this a bug or expected?
Kernel Version: 4.15.0-34, Intel, Ubuntu. Below is the BPF JIT output.

struct info {
        u16 seq_num;
        u32 packet_num;
};

bpf: Failed to load program: Permission denied
0: (69) r1 = *(u16 *)(r1 +12)
1: (6b) *(u16 *)(r10 -8) = r1
2: (b7) r1 = 1000000
3: (63) *(u32 *)(r10 -4) = r1
4: (b7) r1 = 0
5: (63) *(u32 *)(r10 -12) = r1
6: (18) r1 = 0xffff8f86f2998e00
8: (bf) r2 = r10
9: (07) r2 += -12
10: (85) call bpf_map_lookup_elem#1
11: (bf) r6 = r0
12: (15) if r6 == 0x0 goto pc+13
 R0=map_value(id=0,off=0,ks=4,vs=4,imm=0)
R6=map_value(id=0,off=0,ks=4,vs=4,imm=0) R10=fp0
13: (61) r1 = *(u32 *)(r6 +0)
 R0=map_value(id=0,off=0,ks=4,vs=4,imm=0)
R6=map_value(id=0,off=0,ks=4,vs=4,imm=0) R10=fp0
14: (63) *(u32 *)(r10 -16) = r1
15: (18) r1 = 0xffff8f86f2999400
17: (bf) r2 = r10
18: (07) r2 += -16
19: (bf) r3 = r10
20: (07) r3 += -8
21: (b7) r4 = 0
22: (85) call bpf_map_update_elem#2
invalid indirect read from stack off -8+2 size 8

Traceback (most recent call last):
    b = BPF(text=bpf_source)
  File "/usr/lib/python2.7/dist-packages/bcc/__init__.py", line 337, in __init__
    self._trace_autoload()
  File "/usr/lib/python2.7/dist-packages/bcc/__init__.py", line 1038,
in _trace_autoload
    fn = self.load_func(func_name, BPF.TRACEPOINT)
  File "/usr/lib/python2.7/dist-packages/bcc/__init__.py", line 377,
in load_func
    (func_name, errstr))
Exception: Failed to load BPF program tracepoint__<Snip>: Permission denied

If we add "__packed" to above struct is compiled successfully.

-- 
Thanks,
Regards,
Chaitanya T K.

^ permalink raw reply

* RE: [PATCH net-next v2 6/6] net/ncsi: Configure multi-package, multi-channel modes with failover
From: Justin.Lee1 @ 2018-10-26 21:48 UTC (permalink / raw)
  To: sam, netdev; +Cc: davem, linux-kernel, openbmc
In-Reply-To: <20181023215201.27315-7-sam@mendozajonas.com>

Hi Samuel,

There is one place that we assume the next available TX channel is under the same package.
Please see the comment below.

Thanks,
Justin


+/* Change the active Tx channel in a multi-channel setup */
+int ncsi_update_tx_channel(struct ncsi_dev_priv *ndp,
> +			   struct ncsi_package *np,
> +			   struct ncsi_channel *disable,
> +			   struct ncsi_channel *enable)
> +{
> +	struct ncsi_cmd_arg nca;
> +	struct ncsi_channel *nc;
> +	int ret = 0;
> +
> +	if (!np->multi_channel)
> +		netdev_warn(ndp->ndev.dev,
> +			    "NCSI: Trying to update Tx channel in single-channel mode\n");
> +	nca.ndp = ndp;
> +	nca.package = np->id;

If the channel may be on different package, the package ID here may not be correct
in some cases.

> +	nca.req_flags = 0;
> +
> +	/* Find current channel with Tx enabled */
> +	if (!disable) {
> +		NCSI_FOR_EACH_CHANNEL(np, nc)
> +			if (nc->modes[NCSI_MODE_TX_ENABLE].enable)
> +				disable = nc;
> +	}
> +
> +	/* Find a suitable channel for Tx */
> +	if (!enable) {
> +		if (np->preferred_channel &&
> +		    ncsi_channel_has_link(np->preferred_channel)) {
> +			enable = np->preferred_channel;
> +		} else {
> +			NCSI_FOR_EACH_CHANNEL(np, nc) {
> +				if (!(np->channel_whitelist & 0x1 << nc->id))
> +					continue;
> +				if (nc->state != NCSI_CHANNEL_ACTIVE)
> +					continue;
> +				if (ncsi_channel_has_link(nc)) {
> +					enable = nc;
> +					break;
> +				}
> +			}

When we search, we need to consider the other available channel might be on the
package.

> +		}
> +	}
> +
> +	if (disable == enable)
> +		return -1;
> +
> +	if (!enable)
> +		return -1;
> +
> +	if (disable) {
> +		nca.channel = disable->id;
> +		nca.type = NCSI_PKT_CMD_DCNT;
> +		ret = ncsi_xmit_cmd(&nca);
> +		if (ret)
> +			netdev_err(ndp->ndev.dev,
> +				   "Error %d sending DCNT\n",
> +				   ret);
> +	}

I remove the cable from ncsi0 and it doesn't failover to ncsi3 as ncsi0 and ncsi3 are not under
the same package.

cat /sys/kernel/debug/ncsi_protocol/ncsi_device_
IFIDX IFNAME NAME   PID CID RX TX MP MC WP WC PC CS PS LS RU CR NQ HA
======================================================================
  2   eth2   ncsi0  000 000 1  1  1  1  1  1  1  3  0  0  1  1  0  1
  2   eth2   ncsi1  000 001 0  0  1  1  1  0  0  1  0  1  1  1  0  1
  2   eth2   ncsi2  001 000 0  0  1  1  1  0  0  1  0  1  1  1  0  1
  2   eth2   ncsi3  001 001 1  0  1  1  1  1  0  2  1  1  1  1  0  1
======================================================================
MP: Multi-mode Package     WP: Whitelist Package
MC: Multi-mode Channel     WC: Whitelist Channel
PC: Primary Channel        CS: Channel State
PS: Poll Status            LS: Link Status
RU: Running                CR: Carrier OK
NQ: Queue Stopped          HA: Hardware Arbitration

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox