Netdev List
* Re: [PATCHv4 net 0/3] net: sched: ife: malformed ife packet fixes
From: David Miller @ 2018-04-23  1:12 UTC (permalink / raw)
  To: aring; +Cc: yotam.gi, jhs, xiyou.wangcong, jiri, yuvalm, netdev, kernel
In-Reply-To: <20180420191505.27633-1-aring@mojatatu.com>

From: Alexander Aring <aring@mojatatu.com>
Date: Fri, 20 Apr 2018 15:15:02 -0400

> As promised at netdev 2.2 tc workshop I am working on adding scapy support for
> tdc testing. It is still work in progress. I will submit the patches to tdc
> later (they are not in good shape yet). The good news is I have been able to
> find bugs which normal packet testing would not be able to find.
> With fuzz testing I was able to craft certain malformed packets that the IFE
> action was not able to deal with. This patch set fixes those bugs.

Series applied and queued up for -stable.


* Re: [PATCH net] strparser: Do not call mod_delayed_work with a timeout of LONG_MAX
From: David Miller @ 2018-04-23  1:10 UTC (permalink / raw)
  To: doronrk; +Cc: tj, netdev
In-Reply-To: <20180420191111.683209-1-doronrk@fb.com>

From: Doron Roberts-Kedes <doronrk@fb.com>
Date: Fri, 20 Apr 2018 12:11:11 -0700

> struct sock's sk_rcvtimeo is initialized to
> LONG_MAX/MAX_SCHEDULE_TIMEOUT in sock_init_data. Calling
> mod_delayed_work with a timeout of LONG_MAX causes spurious execution of
> the work function. timer->expires is set equal to jiffies + LONG_MAX.
> When timer_base->clk falls behind the current value of jiffies,
> the delta between timer_base->clk and jiffies + LONG_MAX causes the
> expiration to be in the past. Returning early from strp_start_timer if
> timeo == LONG_MAX solves this problem.
> 
> Found while testing net/tls_sw recv path.
> 
> Fixes: 43a0c6751a322847 ("strparser: Stream parser for messages")
> Reviewed-by: Tejun Heo <tj@kernel.org>
> Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com>

Applied and queued up for -stable, thanks.
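
The overflow described in the commit message can be reproduced with plain
unsigned arithmetic. The sketch below is a user-space illustration, not
kernel code: before() follows the style of the kernel's wraparound-safe
jiffies comparison, while looks_expired() and should_arm_timer() are
hypothetical names introduced here. It assumes the usual two's-complement
behaviour when an unsigned long is converted to long.

```c
#include <assert.h>
#include <limits.h>

/* Wraparound-safe "a is before b": the unsigned difference is
 * reinterpreted as a signed value (implementation-defined in ISO C,
 * two's-complement wrap on common compilers).
 */
static int before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Would a timer armed at "now + timeo" already look expired to a timer
 * base whose clock reads "clk"? "clk" may lag behind "now".
 */
static int looks_expired(unsigned long now, unsigned long clk, long timeo)
{
	unsigned long expires = now + (unsigned long)timeo;

	return before(expires, clk);
}

/* Hypothetical guard mirroring the described fix: skip arming the timer
 * when the timeout means "wait forever".
 */
static int should_arm_timer(long timeo)
{
	assert(timeo >= 0);
	return timeo != LONG_MAX;
}
```

With clk equal to now, a LONG_MAX timeout still compares as being in the
future. As soon as clk lags by a single tick, the difference exceeds
LONG_MAX and the expiry appears to be in the past, which is why returning
early for LONG_MAX avoids the spurious work execution.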


* Re: [PATCH for-rc] uapi: Fix SPDX tags for files referring to the 'OpenIB.org' license
From: David Miller @ 2018-04-23  1:08 UTC (permalink / raw)
  To: jgg
  Cc: linux-rdma, kstewart, pombredanne, gregkh, tglx, swinslow,
	santosh.shilimkar, netdev, linux-kernel, davejwatson
In-Reply-To: <20180420154910.GA2519@ziepe.ca>

From: Jason Gunthorpe <jgg@mellanox.com>
Date: Fri, 20 Apr 2018 09:49:10 -0600

> Based on discussion with Kate Stewart this license is not a
> BSD-2-Clause, but is now formally identified as Linux-OpenIB
> by SPDX.
> 
> The key difference between the licenses is in the 'warranty'
> paragraph.
> 
> if_infiniband.h refers to the 'OpenIB.org' license, but
> does not include the text, instead it links to an obsolete
> web site that contains a license that matches the BSD-2-Clause
> SPDX. There is no 'three clause' version of the OpenIB.org
> license.
> 
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

Acked-by: David S. Miller <davem@davemloft.net>


* Re: [PATCH] hv_netvsc: select needed ucs2_string routine
From: David Miller @ 2018-04-23  1:07 UTC (permalink / raw)
  To: stephen; +Cc: netdev, sthemmin
In-Reply-To: <20180420154847.29476-1-sthemmin@microsoft.com>

From: Stephen Hemminger <stephen@networkplumber.org>
Date: Fri, 20 Apr 2018 08:48:47 -0700

> The conversion of the RNDIS friendly name to UTF-8 uses a standard
> kernel routine which is optional in the config. The build would
> therefore fail for some configurations. Resolve by selecting the
> needed library.
> 
> Fixes: 0fe554a46a0f ("hv_netvsc: propogate Hyper-V friendly name into interface alias")
> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>

Please put 'net' or 'net-next' in your Subject lines as appropriate.

I guessed that this is net-next since the Fixes commit only
exists there.

Applied.


* Re: [PATCH] [net] ipv6: sr: fix NULL pointer dereference in seg6_do_srh_encap()- v4 pkts
From: David Miller @ 2018-04-23  1:06 UTC (permalink / raw)
  To: amsalam20; +Cc: dlebrun, kuznet, yoshfuji, netdev, linux-kernel
In-Reply-To: <1524232685-1203-1-git-send-email-amsalam20@gmail.com>

From: Ahmed Abdelsalam <amsalam20@gmail.com>
Date: Fri, 20 Apr 2018 15:58:05 +0200

> In case of seg6 in encap mode, seg6_do_srh_encap() calls set_tun_src()
> in order to set the src addr of outer IPv6 header.
> 
> The net_device is required for set_tun_src(). However, calling ip6_dst_idev()
> on the dst_entry in case of IPv4 traffic results in the following bug.
> 
> Using just dst->dev should fix this BUG.
 ...
> Fixes: 8936ef7604c11 ipv6: sr: fix NULL pointer dereference when setting encap source address

Please format your Fixes: tag properly next time.  The commit header
text should be enclosed by (" ").  I fixed it up for you this time.

> Signed-off-by: Ahmed Abdelsalam <amsalam20@gmail.com>

Applied and queued up for -stable.


* KASAN: null-ptr-deref Read in refcount_inc_not_zero
From: syzbot @ 2018-04-23  1:02 UTC (permalink / raw)
  To: davem, dvlasenk, linux-kernel, netdev, syzkaller-bugs,
	xiaolou4617, xiyou.wangcong

Hello,

syzbot hit the following crash on upstream commit
285848b0f4074f04ab606f1e5dca296482033d54 (Sun Apr 22 04:20:48 2018 +0000)
Merge tag 'random_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random
syzbot dashboard link: https://syzkaller.appspot.com/bug?extid=6a35cd2d9559c909d570

So far this crash happened 1772 times on upstream.
C reproducer: https://syzkaller.appspot.com/x/repro.c?id=5975533900791808
syzkaller reproducer: https://syzkaller.appspot.com/x/repro.syz?id=4813418829709312
Raw console output: https://syzkaller.appspot.com/x/log.txt?id=5008564225572864
Kernel config: https://syzkaller.appspot.com/x/.config?id=1808800213120130118
compiler: gcc (GCC) 8.0.1 20180413 (experimental)

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+6a35cd2d9559c909d570@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for details.
If you forward the report, please keep this part and the footer.

random: sshd: uninitialized urandom read (32 bytes read)
random: sshd: uninitialized urandom read (32 bytes read)
random: sshd: uninitialized urandom read (32 bytes read)
random: sshd: uninitialized urandom read (32 bytes read)
==================================================================
BUG: KASAN: null-ptr-deref in atomic_read include/asm-generic/atomic-instrumented.h:21 [inline]
BUG: KASAN: null-ptr-deref in refcount_inc_not_zero+0x8f/0x2d0 lib/refcount.c:120
Read of size 4 at addr 0000000000000004 by task syzkaller633288/4488

CPU: 0 PID: 4488 Comm: syzkaller633288 Not tainted 4.17.0-rc1+ #12
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
  kasan_report_error mm/kasan/report.c:352 [inline]
  kasan_report.cold.7+0x6d/0x2fe mm/kasan/report.c:412
  check_memory_region_inline mm/kasan/kasan.c:260 [inline]
  check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267
  kasan_check_read+0x11/0x20 mm/kasan/kasan.c:272
  atomic_read include/asm-generic/atomic-instrumented.h:21 [inline]
  refcount_inc_not_zero+0x8f/0x2d0 lib/refcount.c:120
  refcount_inc+0x15/0x70 lib/refcount.c:153
  llc_sap_hold include/net/llc.h:116 [inline]
  llc_ui_release+0xba/0x2b0 net/llc/af_llc.c:207
  sock_release+0x96/0x1b0 net/socket.c:594
  sock_close+0x16/0x20 net/socket.c:1149
  __fput+0x34d/0x890 fs/file_table.c:209
  ____fput+0x15/0x20 fs/file_table.c:243
  task_work_run+0x1e4/0x290 kernel/task_work.c:113
  exit_task_work include/linux/task_work.h:22 [inline]
  do_exit+0x1aee/0x2730 kernel/exit.c:865
  do_group_exit+0x16f/0x430 kernel/exit.c:968
  __do_sys_exit_group kernel/exit.c:979 [inline]
  __se_sys_exit_group kernel/exit.c:977 [inline]
  __x64_sys_exit_group+0x3e/0x50 kernel/exit.c:977
  do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x43e878
RSP: 002b:00007ffd854075f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000043e878
RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
RBP: 00000000004be220 R08: 00000000000000e7 R09: ffffffffffffffd0
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00000000006cc160 R14: 0000000000000000 R15: 0000000000000000
==================================================================
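
A read of size 4 at address 0x4 is what one would expect if llc_sap_hold()
was handed a NULL pointer and the reference count sits four bytes into the
structure. The report itself does not state the layout, so the struct
below is a hypothetical stand-in (not copied from include/net/llc.h) that
only illustrates how the faulting address relates to a field offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: three one-byte fields followed by a 4-byte
 * reference count, which alignment places at offset 4.
 */
struct sap_sketch {
	unsigned char state;
	unsigned char p_bit;
	unsigned char f_bit;
	uint32_t refcnt;
};

/* Address that a read of ->refcnt would touch for a given base address. */
static uintptr_t refcnt_addr(uintptr_t base)
{
	assert(base <= UINTPTR_MAX - offsetof(struct sap_sketch, refcnt));
	return base + offsetof(struct sap_sketch, refcnt);
}
```

Under that assumed layout, a NULL base gives exactly the address 0x4
printed by KASAN.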


---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct all questions to syzkaller@googlegroups.com.

syzbot will keep track of this bug report.
If you forgot to add the Reported-by tag, once the fix for this bug is merged
into any tree, please reply to this email with:
#syz fix: exact-commit-title
If you want to test a patch for this bug, please reply with:
#syz test: git://repo/address.git branch
and provide the patch inline or as an attachment.
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause creation of a new bug report.
Note: all commands must start from beginning of the line in the email body.


* KASAN: slab-out-of-bounds Read in __sctp_v6_cmp_addr
From: syzbot @ 2018-04-23  1:02 UTC (permalink / raw)
  To: davem, linux-kernel, linux-sctp, netdev, nhorman, syzkaller-bugs,
	vyasevich

Hello,

syzbot hit the following crash on upstream commit
83beed7b2b26f232d782127792dd0cd4362fdc41 (Fri Apr 20 17:56:32 2018 +0000)
Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal
syzbot dashboard link: https://syzkaller.appspot.com/bug?extid=cd494c1dd681d4d93ebb

So far this crash happened 305 times on net-next, upstream.
C reproducer: https://syzkaller.appspot.com/x/repro.c?id=6684817483628544
syzkaller reproducer: https://syzkaller.appspot.com/x/repro.syz?id=6321732692475904
Raw console output: https://syzkaller.appspot.com/x/log.txt?id=5381423422767104
Kernel config: https://syzkaller.appspot.com/x/.config?id=1808800213120130118
compiler: gcc (GCC) 8.0.1 20180413 (experimental)

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+cd494c1dd681d4d93ebb@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for details.
If you forward the report, please keep this part and the footer.

==================================================================
BUG: KASAN: slab-out-of-bounds in ipv6_addr_equal include/net/ipv6.h:507 [inline]
BUG: KASAN: slab-out-of-bounds in __sctp_v6_cmp_addr+0x4c7/0x530 net/sctp/ipv6.c:580
Read of size 8 at addr ffff8801b58626d0 by task syzkaller106428/4452

CPU: 1 PID: 4452 Comm: syzkaller106428 Not tainted 4.17.0-rc1+ #10
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
  print_address_description+0x6c/0x20b mm/kasan/report.c:256
  kasan_report_error mm/kasan/report.c:354 [inline]
  kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
  ipv6_addr_equal include/net/ipv6.h:507 [inline]
  __sctp_v6_cmp_addr+0x4c7/0x530 net/sctp/ipv6.c:580
  sctp_inet6_cmp_addr+0x169/0x1a0 net/sctp/ipv6.c:898
  sctp_bind_addr_conflict+0x28c/0x470 net/sctp/bind_addr.c:368
  sctp_get_port_local+0x9fc/0x1540 net/sctp/socket.c:7515
  sctp_do_bind+0x21c/0x5f0 net/sctp/socket.c:435
  sctp_bindx_add+0x90/0x1a0 net/sctp/socket.c:529
  sctp_setsockopt_bindx+0x2ad/0x320 net/sctp/socket.c:1058
  sctp_setsockopt+0x12c4/0x7000 net/sctp/socket.c:4227
  sock_common_setsockopt+0x9a/0xe0 net/core/sock.c:3039
  __sys_setsockopt+0x1bd/0x390 net/socket.c:1903
  __do_sys_setsockopt net/socket.c:1914 [inline]
  __se_sys_setsockopt net/socket.c:1911 [inline]
  __x64_sys_setsockopt+0xbe/0x150 net/socket.c:1911
  do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x445839
RSP: 002b:00007fbe3f0fdd98 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 00000000006dac24 RCX: 0000000000445839
RDX: 0000000000000064 RSI: 0000000000000084 RDI: 0000000000000004
RBP: 00000000006dac20 R08: 0000000000000010 R09: 000000000000a6fe
R10: 00000000205ba000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffc1404827f R14: 00007fbe3f0fe9c0 R15: 0000000000000003

Allocated by task 4452:
  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
  set_track mm/kasan/kasan.c:460 [inline]
  kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553
  __do_kmalloc_node mm/slab.c:3682 [inline]
  __kmalloc_node+0x47/0x70 mm/slab.c:3689
  kmalloc_node include/linux/slab.h:554 [inline]
  kvmalloc_node+0x6b/0x100 mm/util.c:421
  kvmalloc include/linux/mm.h:550 [inline]
  vmemdup_user+0x2d/0xa0 mm/util.c:186
  sctp_setsockopt_bindx+0x5d/0x320 net/sctp/socket.c:1022
  sctp_setsockopt+0x12c4/0x7000 net/sctp/socket.c:4227
  sock_common_setsockopt+0x9a/0xe0 net/core/sock.c:3039
  __sys_setsockopt+0x1bd/0x390 net/socket.c:1903
  __do_sys_setsockopt net/socket.c:1914 [inline]
  __se_sys_setsockopt net/socket.c:1911 [inline]
  __x64_sys_setsockopt+0xbe/0x150 net/socket.c:1911
  do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 2818:
  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
  set_track mm/kasan/kasan.c:460 [inline]
  __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521
  kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
  __cache_free mm/slab.c:3498 [inline]
  kfree+0xd9/0x260 mm/slab.c:3813
  single_release+0x8f/0xb0 fs/seq_file.c:609
  __fput+0x34d/0x890 fs/file_table.c:209
  ____fput+0x15/0x20 fs/file_table.c:243
  task_work_run+0x1e4/0x290 kernel/task_work.c:113
  tracehook_notify_resume include/linux/tracehook.h:191 [inline]
  exit_to_usermode_loop+0x2bd/0x310 arch/x86/entry/common.c:166
  prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
  syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
  do_syscall_64+0x6ac/0x800 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

The buggy address belongs to the object at ffff8801b58626c0
  which belongs to the cache kmalloc-32 of size 32
The buggy address is located 16 bytes inside of
  32-byte region [ffff8801b58626c0, ffff8801b58626e0)
The buggy address belongs to the page:
page:ffffea0006d61880 count:1 mapcount:0 mapping:ffff8801b5862000 index:0xffff8801b5862fc1
flags: 0x2fffc0000000100(slab)
raw: 02fffc0000000100 ffff8801b5862000 ffff8801b5862fc1 0000000100000032
raw: ffffea0006ddd1e0 ffffea0006dd2860 ffff8801da8001c0 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
  ffff8801b5862580: fb fb fb fb fc fc fc fc fb fb fb fb fc fc fc fc
  ffff8801b5862600: fb fb fb fb fc fc fc fc fb fb fb fb fc fc fc fc
> ffff8801b5862680: fb fb fb fb fc fc fc fc 00 00 fc fc fc fc fc fc
                                                  ^
  ffff8801b5862700: 00 00 00 00 fc fc fc fc 00 00 04 fc fc fc fc fc
  ffff8801b5862780: fb fb fb fb fc fc fc fc fb fb fb fb fc fc fc fc
==================================================================
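
The "Memory state" dump can be read as one shadow byte per 8 bytes of
memory. The sketch below decodes the row printed for ffff8801b5862680 and
checks the faulting read against it. It is a user-space illustration that
assumes the generic KASAN encoding (0x00 for a fully accessible granule,
0x01..0x07 for a partially accessible one, other values such as fb/fc for
poisoned memory); the function names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define GRANULE 8 /* bytes of memory described by one shadow byte */

/* Shadow row printed for ffff8801b5862680 in the report. */
static const uint8_t shadow_row[16] = {
	0xfb, 0xfb, 0xfb, 0xfb, 0xfc, 0xfc, 0xfc, 0xfc,
	0x00, 0x00, 0xfc, 0xfc, 0xfc, 0xfc, 0xfc, 0xfc,
};

/* Number of accessible bytes in a granule: 0x00 means all 8,
 * 0x01..0x07 means only the first N, anything else is poison.
 */
static int accessible_bytes(uint8_t shadow)
{
	if (shadow == 0x00)
		return GRANULE;
	if (shadow < GRANULE)
		return shadow;
	return 0;
}

/* Is a read of "size" bytes at "addr" fully valid? "row_base" is the
 * address covered by shadow_row[0]. Only reads that stay inside one
 * granule of this row are supported by the sketch.
 */
static int read_ok(uint64_t row_base, uint64_t addr, unsigned int size)
{
	uint64_t off = addr - row_base;
	unsigned int in_granule = (unsigned int)(off % GRANULE);

	assert(off < sizeof(shadow_row) * GRANULE);
	assert(in_granule + size <= GRANULE);
	return in_granule + size <=
	       (unsigned int)accessible_bytes(shadow_row[off / GRANULE]);
}
```

Under that reading, the object at ffff8801b58626c0 has only 16 valid bytes
(two 00 shadow bytes), so the 8-byte read at ffff8801b58626d0, 16 bytes
into the object, lands in the redzone.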


---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct all questions to syzkaller@googlegroups.com.

syzbot will keep track of this bug report.
If you forgot to add the Reported-by tag, once the fix for this bug is merged
into any tree, please reply to this email with:
#syz fix: exact-commit-title
If you want to test a patch for this bug, please reply with:
#syz test: git://repo/address.git branch
and provide the patch inline or as an attachment.
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause creation of a new bug report.
Note: all commands must start from beginning of the line in the email body.


* Re: [PATCH net-next] net: stmmac: Implement logic to automatically select HW Interface
From: David Miller @ 2018-04-23  0:59 UTC (permalink / raw)
  To: Jose.Abreu
  Cc: netdev, Joao.Pinto, Vitor.Soares, peppe.cavallaro,
	alexandre.torgue
In-Reply-To: <9d14a1f97e0b963125c898d571ff44547660e9a9.1524151243.git.joabreu@synopsys.com>

From: Jose Abreu <Jose.Abreu@synopsys.com>
Date: Thu, 19 Apr 2018 16:24:15 +0100

> @@ -0,0 +1,216 @@
> +// SPDX-License-Identifier: (GPL-2.0 OR MIT)
> +// Copyright (c) 2018 Synopsys, Inc. and/or its affiliates.
> +// stmmac HW Interface Handling

Please do not use C++ style comments for anything past the
SPDX identifier line.

Thank you.


* Re: XDP breakage with virtio due to 6870de435b90c083ae0f3f7f341287976ef56f03
From: Nikita V. Shirokov @ 2018-04-23  0:55 UTC (permalink / raw)
  To: David Ahern; +Cc: Jason Wang, daniel, netdev@vger.kernel.org
In-Reply-To: <f50aa641-09a7-4a08-498c-e7aa6ac50cc6@gmail.com>

On Sun, Apr 22, 2018 at 04:47:48PM -0600, David Ahern wrote:
> This commit breaks my FIB forwarding program:
> 
> commit 6870de435b90c083ae0f3f7f341287976ef56f03
> Author: Nikita V. Shirokov <tehnerd@tehnerd.com>
> Date:   Tue Apr 17 21:42:20 2018 -0700
> 
>     bpf: make virtio compatible w/ bpf_xdp_adjust_tail
> 
>     w/ bpf_xdp_adjust_tail helper xdp's data_end pointer could be changed as
>     well (only "decrease" of pointer's location is going to be supported).
>     changing of this pointer will change packet's size.
>     for virtio driver we need to adjust XDP_PASS handling by recalculating
>     length of the packet if it was passed to the TCP/IP stack
> 
>     Reviewed-by: Jason Wang <jasowang@redhat.com>
>     Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
>     Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> 
> ###
> 
> Some of the packets (e.g., ARP or those without a resolved neighbor) are
> passed to the networking stack. What shows up are clearly broken packets:
> 
> # tcpdump -n -i eth1
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on eth1, link-type EN10MB (Ethernet), capture size 262144 bytes
> 15:45:29.693238 [|ARP]
> 	0x0000:  0001 0800 0604 0001 42c0 ce2f 3fa9 0a64  ........B../?..d
> 15:45:30.710327 [|ARP]
> 	0x0000:  0001 0800 0604 0001 42c0 ce2f 3fa9 0a64  ........B../?..d
> 15:45:31.734296 [|ARP]
> 	0x0000:  0001 0800 0604 0001 42c0 ce2f 3fa9 0a64  ........B../?..d
> 15:45:32.908720 IP6 truncated-ip6 - 12 bytes missing!fe80::40c0:ceff:fe2f:3fa9 > ff02::1:ff00:2: ICMP6, neighbor solicitation[|icmp6]
> 15:45:33.910530 IP6 truncated-ip6 - 12 bytes missing!2001:db8:1::64 > ff02::1:ff00:2: ICMP6, neighbor solicitation[|icmp6]
> 15:45:34.934437 IP6 truncated-ip6 - 12 bytes missing!2001:db8:1::64 > ff02::1:ff00:2: ICMP6, neighbor solicitation[|icmp6]
> 15:45:35.958394 IP6 truncated-ip6 - 12 bytes missing!2001:db8:1::64 > ff02::1:ff00:2: ICMP6, neighbor solicitation[|icmp6]
> 
> Reverting the mentioned patch fixes the problem.
Hi, David.
Thanks for reporting this. It looks like I missed vi->hdr_len when
calculating the new length (it was len = xdp->data_end - xdp->data, but
it should also add vi->hdr_len). I will run a few more tests before
sending a fix.
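
The length arithmetic in question can be sketched in user space as
follows. struct xdp_sketch and both helpers are stand-ins invented for
this illustration, and hdr_len takes the place of vi->hdr_len. The value
12 used in the usage note is an assumption chosen to match the "12 bytes
missing" reported by tcpdump; it is not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the two xdp_buff pointers discussed above. */
struct xdp_sketch {
	unsigned char *data;
	unsigned char *data_end;
};

/* Length as in the calculation that dropped the header. */
static size_t len_without_hdr(const struct xdp_sketch *xdp)
{
	assert(xdp->data_end >= xdp->data);
	return (size_t)(xdp->data_end - xdp->data);
}

/* Length with the header added back; hdr_len stands in for vi->hdr_len. */
static size_t len_with_hdr(const struct xdp_sketch *xdp, size_t hdr_len)
{
	return len_without_hdr(xdp) + hdr_len;
}
```

For a 42-byte frame and an assumed 12-byte header, the first helper yields
42 and the second 54, so a consumer expecting the header-inclusive length
would see the packet as 12 bytes short.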


* Re: [PATCH bpf-next,v2 1/2] bpf: add helper for getting xfrm states
From: Alexei Starovoitov @ 2018-04-23  0:34 UTC (permalink / raw)
  To: Eyal Birger; +Cc: netdev, shmulik, ast, daniel, fw, steffen.klassert
In-Reply-To: <20180420064356.7a703a1e@jimi>

On Fri, Apr 20, 2018 at 06:43:56AM +0300, Eyal Birger wrote:
> Hi,
> 
> On Wed, 18 Apr 2018 15:31:03 -0700
> Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> 
> > On Thu, Apr 19, 2018 at 12:58:22AM +0300, Eyal Birger wrote:
> > > This commit introduces a helper which allows fetching xfrm state
> > > parameters by eBPF programs attached to TC.
> > > 
> > > Prototype:
> > > bpf_skb_get_xfrm_state(skb, index, xfrm_state, size, flags)
> > > 
> > > skb: pointer to skb
> > > index: the index in the skb xfrm_state secpath array
> > > xfrm_state: pointer to 'struct bpf_xfrm_state'
> > > size: size of 'struct bpf_xfrm_state'
> > > flags: reserved for future extensions
> > > 
> 
> <snip>
>  
> > > +#ifdef CONFIG_XFRM
> > > +BPF_CALL_5(bpf_skb_get_xfrm_state, struct sk_buff *, skb, u32, index,
> > > +	   struct bpf_xfrm_state *, to, u32, size, u64, flags)
> > > +{
> > > +	const struct sec_path *sp = skb_sec_path(skb);
> > > +	const struct xfrm_state *x;
> > > +
> > > +	if (!sp || unlikely(index >= sp->len || flags))
> > > +		goto err_clear;
> > > +
> > > +	x = sp->xvec[index];
> > > +
> > > +	if (unlikely(size != sizeof(struct bpf_xfrm_state)))
> > > +		goto err_clear;
> > > +
> > > +	to->reqid = x->props.reqid;
> > > +	to->spi = be32_to_cpu(x->id.spi);
> > > +	to->family = x->props.family;
> > > +	if (to->family == AF_INET6) {
> > > +		memcpy(to->remote_ipv6, x->props.saddr.a6,
> > > +		       sizeof(to->remote_ipv6));
> > > +	} else {
> > > +		to->remote_ipv4 = be32_to_cpu(x->props.saddr.a4);
> > > +	}  
> > 
> > that looks inconsistent. Why is v4 cpu endian, but v6 not?
> 
> I agree. I followed the reference in bpf_skb_get_tunnel_key().
> I can keep v4 in net endianness too.

argh.
On the one hand it makes sense to be consistent with bpf_skb_get_tunnel_key(),
but it's certainly confusing to have v4 and v6 in different endianness.
Imagine a man page that says that bpf folks made a mistake in that
helper and kept repeating it in other helpers for consistency...
Daniel, what do you think?
Do you remember the history with bpf_skb_get_tunnel_key and
why it happened that way?

> > Why change endianness of the spi?
> 
> I felt it was more consistent with other fields and usually helpful for
> programs. I can keep it in network order.
> 
> In which case, do you expect it to be typed as __be32 in bpf.h?
> (I haven't seen other cases)?

It can be __u32 with a comment /* Stored in network byte order */
like in a bunch of other fields.
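
A minimal user-space sketch of that convention is shown below. The struct
is a hypothetical subset with uint32_t standing in for __u32, and fill()
and first_byte() are names invented here. In the kernel the SPI is
already in network byte order and would simply be copied; htonl() is used
only to simulate that from host-order test values.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical subset of the proposed struct. */
struct xfrm_state_sketch {
	uint32_t reqid;
	uint32_t spi;		/* Stored in network byte order */
	uint32_t remote_ipv4;	/* Stored in network byte order */
};

/* Fill the struct from host-order inputs, converting the two
 * network-order fields on the way in.
 */
static void fill(struct xfrm_state_sketch *to, uint32_t reqid,
		 uint32_t spi_host, uint32_t addr_host)
{
	assert(to != NULL);
	to->reqid = reqid;
	to->spi = htonl(spi_host);
	to->remote_ipv4 = htonl(addr_host);
}

/* First byte of a 32-bit value as it sits in memory. */
static unsigned char first_byte(uint32_t v)
{
	unsigned char b[4];

	memcpy(b, &v, sizeof(b));
	return b[0];
}
```

A program that wants a host-order value calls ntohl() on the field, while
the in-memory layout stays big-endian on every architecture, consistent
with how the v6 address bytes are copied.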


* Re: [PATCH bpf-next v3 9/9] tools/bpf: add a test for bpf_get_stack with tracepoint prog
From: Alexei Starovoitov @ 2018-04-23  0:27 UTC (permalink / raw)
  To: Yonghong Song; +Cc: ast, daniel, netdev, kernel-team
In-Reply-To: <20180420221842.742330-10-yhs@fb.com>

On Fri, Apr 20, 2018 at 03:18:42PM -0700, Yonghong Song wrote:
> The test_stacktrace_map and test_stacktrace_build_id are
> enhanced to call bpf_get_stack in the helper to get the
> stack trace as well.  The stack traces from bpf_get_stack
> and bpf_get_stackid are compared to ensure that, for the
> same stack (as represented by the same hash), their ip addresses
> or build ids are the same.
> 
> Signed-off-by: Yonghong Song <yhs@fb.com>
> ---
>  tools/testing/selftests/bpf/test_progs.c           | 63 +++++++++++++++++++---
>  .../selftests/bpf/test_stacktrace_build_id.c       | 20 ++++++-
>  tools/testing/selftests/bpf/test_stacktrace_map.c  | 20 +++++--
>  3 files changed, 92 insertions(+), 11 deletions(-)
> 
> diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
> index dad4c3f..06b922a 100644
> --- a/tools/testing/selftests/bpf/test_progs.c
> +++ b/tools/testing/selftests/bpf/test_progs.c
> @@ -897,11 +897,40 @@ static int compare_map_keys(int map1_fd, int map2_fd)
>  	return 0;
>  }
>  
> +static int compare_stack_ips(int smap_fd, int amap_fd, int stack_trace_len)
> +{
> +	__u32 key, next_key, *cur_key_p, *next_key_p;
> +	char val_buf1[stack_trace_len], val_buf2[stack_trace_len];

the kernel is trying to get rid of VLAs.
test_progs.c already uses them, but if possible let's not
add more uses of them.
Other than that looks great.
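
One way to avoid the two VLAs is to take the scratch buffers from the
heap. The sketch below is an assumption about how that could look, not
the code that was eventually merged: compare_vals_no_vla() is a
hypothetical helper, and memcpy() stands in for the map lookups that
would fill the buffers in the real test.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Compare two stack-trace values of stack_trace_len bytes using heap
 * scratch buffers instead of VLAs. Returns 0 if equal, 1 if different,
 * -1 on allocation failure.
 */
static int compare_vals_no_vla(const void *val1, const void *val2,
			       size_t stack_trace_len)
{
	char *val_buf1, *val_buf2;
	int ret = -1;

	assert(stack_trace_len > 0);
	val_buf1 = malloc(stack_trace_len);
	val_buf2 = malloc(stack_trace_len);
	if (!val_buf1 || !val_buf2)
		goto out;

	/* The real test would fill these via map lookups. */
	memcpy(val_buf1, val1, stack_trace_len);
	memcpy(val_buf2, val2, stack_trace_len);
	ret = memcmp(val_buf1, val_buf2, stack_trace_len) ? 1 : 0;
out:
	free(val_buf1);
	free(val_buf2);
	return ret;
}
```

The single exit path frees both buffers, including the case where only
one allocation succeeded, since free(NULL) is a no-op.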


* Re: [PATCH bpf-next v3 8/9] tools/bpf: add a test for bpf_get_stack with raw tracepoint prog
From: Alexei Starovoitov @ 2018-04-23  0:23 UTC (permalink / raw)
  To: Yonghong Song; +Cc: ast, daniel, netdev, kernel-team
In-Reply-To: <20180420221842.742330-9-yhs@fb.com>

On Fri, Apr 20, 2018 at 03:18:41PM -0700, Yonghong Song wrote:
> The test attaches a raw_tracepoint program to sched/sched_switch.
> It tests getting the stack for user space, kernel space and user
> space with a build_id request. It also tests getting the user
> and kernel stacks into the same buffer with back-to-back
> bpf_get_stack helper calls.
> 
> Whenever the kernel stack is available, the user space
> application will check to ensure that the kernel function
> for raw_tracepoint ___bpf_prog_run is part of the stack.
> 
> Signed-off-by: Yonghong Song <yhs@fb.com>
> ---
>  tools/testing/selftests/bpf/Makefile               |   3 +-
>  tools/testing/selftests/bpf/test_get_stack_rawtp.c | 102 ++++++++++++++++++
>  tools/testing/selftests/bpf/test_progs.c           | 115 +++++++++++++++++++++
>  3 files changed, 219 insertions(+), 1 deletion(-)
>  create mode 100644 tools/testing/selftests/bpf/test_get_stack_rawtp.c
> 
> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
> index 0b72cc7..54e9e74 100644
> --- a/tools/testing/selftests/bpf/Makefile
> +++ b/tools/testing/selftests/bpf/Makefile
> @@ -32,7 +32,7 @@ TEST_GEN_FILES = test_pkt_access.o test_xdp.o test_l4lb.o test_tcp_estats.o test
>  	test_l4lb_noinline.o test_xdp_noinline.o test_stacktrace_map.o \
>  	sample_map_ret0.o test_tcpbpf_kern.o test_stacktrace_build_id.o \
>  	sockmap_tcp_msg_prog.o connect4_prog.o connect6_prog.o test_adjust_tail.o \
> -	test_btf_haskv.o test_btf_nokv.o
> +	test_btf_haskv.o test_btf_nokv.o test_get_stack_rawtp.o
>  
>  # Order correspond to 'make run_tests' order
>  TEST_PROGS := test_kmod.sh \
> @@ -56,6 +56,7 @@ $(TEST_GEN_PROGS_EXTENDED): $(OUTPUT)/libbpf.a
>  $(OUTPUT)/test_dev_cgroup: cgroup_helpers.c
>  $(OUTPUT)/test_sock: cgroup_helpers.c
>  $(OUTPUT)/test_sock_addr: cgroup_helpers.c
> +$(OUTPUT)/test_progs: trace_helpers.c
>  
>  .PHONY: force
>  
> diff --git a/tools/testing/selftests/bpf/test_get_stack_rawtp.c b/tools/testing/selftests/bpf/test_get_stack_rawtp.c
> new file mode 100644
> index 0000000..ba1dcf9
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/test_get_stack_rawtp.c
> @@ -0,0 +1,102 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <linux/bpf.h>
> +#include "bpf_helpers.h"
> +
> +/* Permit pretty deep stack traces */
> +#define MAX_STACK_RAWTP 100
> +struct stack_trace_t {
> +	int pid;
> +	int kern_stack_size;
> +	int user_stack_size;
> +	int user_stack_buildid_size;
> +	__u64 kern_stack[MAX_STACK_RAWTP];
> +	__u64 user_stack[MAX_STACK_RAWTP];
> +	struct bpf_stack_build_id user_stack_buildid[MAX_STACK_RAWTP];
> +};
> +
> +struct bpf_map_def SEC("maps") perfmap = {
> +	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
> +	.key_size = sizeof(int),
> +	.value_size = sizeof(__u32),
> +	.max_entries = 2,
> +};
> +
> +struct bpf_map_def SEC("maps") stackdata_map = {
> +	.type = BPF_MAP_TYPE_PERCPU_ARRAY,
> +	.key_size = sizeof(__u32),
> +	.value_size = sizeof(struct stack_trace_t),
> +	.max_entries = 1,
> +};
> +
> +/* Allocate per-cpu space twice the needed. For the code below
> + *   usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
> + *   if (usize < 0)
> + *     return 0;
> + *   ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> + *
> + * If we have value_size = MAX_STACK_RAWTP * sizeof(__u64),
> + * verifier will complain that access "raw_data + usize"
> + * with size "max_len - usize" may be out of bound.
> + * The maximum "raw_data + usize" is "raw_data + max_len"
> + * and the maximum "max_len - usize" is "max_len", verifier
> + * concludes that the maximum buffer access range is
> + * "raw_data[0...max_len * 2 - 1]" and hence reject the program.
> + *
> + * Doubling the to-be-used max buffer size can fix this verifier
> + * issue and avoid complicated C programming massaging.
> + * This is an acceptable workaround since there is one entry here.
> + */
> +struct bpf_map_def SEC("maps") rawdata_map = {
> +	.type = BPF_MAP_TYPE_PERCPU_ARRAY,
> +	.key_size = sizeof(__u32),
> +	.value_size = MAX_STACK_RAWTP * sizeof(__u64) * 2,
> +	.max_entries = 1,
> +};
> +
> +SEC("tracepoint/sched/sched_switch")
> +int bpf_prog1(void *ctx)
> +{
> +	int max_len, max_buildid_len, usize, ksize, total_size;
> +	struct stack_trace_t *data;
> +	void *raw_data;
> +	__u32 key = 0;
> +
> +	data = bpf_map_lookup_elem(&stackdata_map, &key);
> +	if (!data)
> +		return 0;
> +
> +	max_len = MAX_STACK_RAWTP * sizeof(__u64);
> +	max_buildid_len = MAX_STACK_RAWTP * sizeof(struct bpf_stack_build_id);
> +	data->pid = bpf_get_current_pid_tgid();
> +	data->kern_stack_size = bpf_get_stack(ctx, data->kern_stack,
> +					      max_len, 0);
> +	data->user_stack_size = bpf_get_stack(ctx, data->user_stack, max_len,
> +					    BPF_F_USER_STACK);
> +	data->user_stack_buildid_size = bpf_get_stack(
> +		ctx, data->user_stack_buildid, max_buildid_len,
> +		BPF_F_USER_STACK | BPF_F_USER_BUILD_ID);
> +	bpf_perf_event_output(ctx, &perfmap, 0, data, sizeof(*data));
> +
> +	/* write both kernel and user stacks to the same buffer */
> +	raw_data = bpf_map_lookup_elem(&rawdata_map, &key);
> +	if (!raw_data)
> +		return 0;
> +
> +	usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
> +	if (usize < 0)
> +		return 0;
> +
> +	ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> +	if (ksize < 0)

Maybe instead of teaching the verifier about ARSH (which doesn't look
straightforward), such a use case can be done as:
u32 max_len, usize, ksize;
ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
if ((int)ksize < 0)

That's certainly suboptimal and very much non-obvious to program
developers, but at least it can unblock the bpf_get_stack part
landing, and proper ARSH support can be added later?
Just a thought.

> +		return 0;
> +
> +	total_size = usize + ksize;
> +	if (total_size > 0 && total_size <= max_len)
> +		bpf_perf_event_output(ctx, &perfmap, 0, raw_data, total_size);
> +
> +	return 0;
> +}

The rest of the test looks great. Thank you for adding such an exhaustive test.
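
The unsigned-variable idea suggested earlier in this reply can be tried
out in user space. In the sketch below, ret_is_error() and remaining()
are hypothetical helpers, and uint32_t stands in for u32; it relies on
the common two's-complement behaviour when an out-of-range unsigned value
is cast to a signed type.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_STACK_RAWTP 100

/* A helper result (byte count or negative errno) held in an unsigned
 * 32-bit variable is checked for failure through a signed cast.
 */
static int ret_is_error(uint32_t ret)
{
	return (int32_t)ret < 0;
}

/* Space left in the buffer after the user stack was written. */
static uint32_t remaining(uint32_t max_len, uint32_t usize)
{
	assert(usize <= max_len);
	return max_len - usize;
}
```

Because usize is unsigned and already known not to be an error, the
subtraction max_len - usize cannot produce a negative size, which is the
property the signed version could not demonstrate to the verifier.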


* Re: [PATCH bpf-next v3 6/9] samples/bpf: move common-purpose trace functions to selftests
From: Alexei Starovoitov @ 2018-04-23  0:17 UTC (permalink / raw)
  To: Yonghong Song; +Cc: ast, daniel, netdev, kernel-team
In-Reply-To: <20180420221842.742330-7-yhs@fb.com>

On Fri, Apr 20, 2018 at 03:18:39PM -0700, Yonghong Song wrote:
> There is no functionality change in this patch. The common-purpose
> trace functions, including perf_event polling and ksym lookup,
> are moved from trace_output_user.c and bpf_load.c to
> selftests/bpf/trace_helpers.c so that these functions can
> be reused later in selftests.
> 
> Signed-off-by: Yonghong Song <yhs@fb.com>

Acked-by: Alexei Starovoitov <ast@kernel.org>


* Re: [PATCH bpf-next v3 4/9] bpf/verifier: improve register value range tracking with ARSH
From: Alexei Starovoitov @ 2018-04-23  0:16 UTC (permalink / raw)
  To: Yonghong Song; +Cc: ast, daniel, netdev, kernel-team
In-Reply-To: <20180420221842.742330-5-yhs@fb.com>

On Fri, Apr 20, 2018 at 03:18:37PM -0700, Yonghong Song wrote:
> When helpers like bpf_get_stack return an int value that is
> later used for arithmetic computation, the LSH and ARSH
> operations are often required to get proper sign extension into
> 64-bit. For example, without this patch:
>     54: R0=inv(id=0,umax_value=800)
>     54: (bf) r8 = r0
>     55: R0=inv(id=0,umax_value=800) R8_w=inv(id=0,umax_value=800)
>     55: (67) r8 <<= 32
>     56: R8_w=inv(id=0,umax_value=3435973836800,var_off=(0x0; 0x3ff00000000))
>     56: (c7) r8 s>>= 32
>     57: R8=inv(id=0)
> With this patch:
>     54: R0=inv(id=0,umax_value=800)
>     54: (bf) r8 = r0
>     55: R0=inv(id=0,umax_value=800) R8_w=inv(id=0,umax_value=800)
>     55: (67) r8 <<= 32
>     56: R8_w=inv(id=0,umax_value=3435973836800,var_off=(0x0; 0x3ff00000000))
>     56: (c7) r8 s>>= 32
>     57: R8=inv(id=0, umax_value=800,var_off=(0x0; 0x3ff))
> With better range of "R8", later on when "R8" is added to other register,
> e.g., a map pointer or scalar-value register, the better register
> range can be derived and verifier failure may be avoided.
> 
> In our later example,
>     ......
>     usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
>     if (usize < 0)
>         return 0;
>     ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
>     ......
> Without improving ARSH value range tracking, the register representing
> "max_len - usize" will have smin_value equal to S64_MIN and will be
> rejected by verifier.
> 
> Signed-off-by: Yonghong Song <yhs@fb.com>
> ---
>  kernel/bpf/verifier.c | 26 ++++++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 3c8bb92..01c215d 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -2975,6 +2975,32 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env,
>  		/* We may learn something more from the var_off */
>  		__update_reg_bounds(dst_reg);
>  		break;
> +	case BPF_ARSH:
> +		if (umax_val >= insn_bitness) {
> +			/* Shifts greater than 31 or 63 are undefined.
> +			 * This includes shifts by a negative number.
> +			 */
> +			mark_reg_unknown(env, regs, insn->dst_reg);
> +			break;
> +		}
> +		if (dst_reg->smin_value < 0)
> +			dst_reg->smin_value >>= umin_val;
> +		else
> +			dst_reg->smin_value >>= umax_val;
> +		if (dst_reg->smax_value < 0)
> +			dst_reg->smax_value >>= umax_val;
> +		else
> +			dst_reg->smax_value >>= umin_val;
> +		if (src_known)
> +			dst_reg->var_off = tnum_rshift(dst_reg->var_off,
> +						       umin_val);
> +		else
> +			dst_reg->var_off = tnum_rshift(tnum_unknown, umin_val);
> +		dst_reg->umin_value >>= umax_val;
> +		dst_reg->umax_value >>= umin_val;
> +		/* We may learn something more from the var_off */
> +		__update_reg_bounds(dst_reg);

I'm struggling to understand how these bounds are computed.
Could you add examples in the comments?
In particular if dst_reg is unknown (tnum.mask == -1)
the above tnum_rshift() will clear upper bits and will make it
64-bit positive, but that doesn't seem correct.
What am I missing?
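
To make the question concrete, here is a small user-space sketch of the
two pieces involved. It is not the verifier code: arsh64(), arsh_bounds(),
tnum_rshift_model() and tnum_contains() are names made up for this
illustration, and the tnum model only assumes that the shift is applied
logically to both value and mask.

```c
#include <assert.h>
#include <stdint.h>

/* Portable arithmetic right shift. C leaves >> on negative values
 * implementation-defined, so shift the (non-negative) complement.
 */
static int64_t arsh64(int64_t v, unsigned int shift)
{
	assert(shift < 64);
	return v < 0 ? ~(~v >> shift) : v >> shift;
}

struct srange {
	int64_t smin, smax;
};

/* Signed bounds of "x s>>= s" for x in [smin, smax], s in [umin, umax].
 * A negative bound moves towards -1 as the shift grows, a non-negative
 * bound moves towards 0, hence the choice of shift per bound.
 */
static struct srange arsh_bounds(struct srange in, unsigned int umin,
				 unsigned int umax)
{
	struct srange out;

	assert(in.smin <= in.smax && umin <= umax && umax < 64);
	out.smin = arsh64(in.smin, in.smin < 0 ? umin : umax);
	out.smax = arsh64(in.smax, in.smax < 0 ? umax : umin);
	return out;
}

/* Simplified stand-in for a tnum: bits set in mask are unknown. */
struct tnum {
	uint64_t value, mask;
};

/* Logical shift of both value and mask. */
static struct tnum tnum_rshift_model(struct tnum a, unsigned int shift)
{
	assert(shift < 64);
	return (struct tnum){ a.value >> shift, a.mask >> shift };
}

/* Does the tracked state admit the concrete value x? */
static int tnum_contains(struct tnum t, uint64_t x)
{
	return (x & ~t.mask) == t.value;
}
```

With x in [-8, 8] and a shift in [1, 2], arsh_bounds() yields [-4, 4].
For a fully unknown register (mask == -1) shifted by 32, the model above
marks the upper 32 bits as known zero, yet arsh64(-1, 32) is still -1, so
the concrete result is not contained in the tracked state.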

^ permalink raw reply

* Re: [PATCH bpf-next v3 3/9] bpf/verifier: refine retval R0 state for bpf_get_stack helper
From: Alexei Starovoitov @ 2018-04-22 23:55 UTC (permalink / raw)
  To: Yonghong Song; +Cc: ast, daniel, netdev, kernel-team
In-Reply-To: <20180420221842.742330-4-yhs@fb.com>

On Fri, Apr 20, 2018 at 03:18:36PM -0700, Yonghong Song wrote:
> The special property of return values for helpers bpf_get_stack
> and bpf_probe_read_str are captured in verifier.
> Both helpers return a negative error code or
> a length, which is equal to or smaller than the buffer
> size argument. This additional information in the
> verifier can avoid the condition such as "retval > bufsize"
> in the bpf program. For example, for the code below,
>     usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
>     if (usize < 0 || usize > max_len)
>         return 0;
> The verifier may have the following errors:
>     52: (85) call bpf_get_stack#65
>      R0=map_value(id=0,off=0,ks=4,vs=1600,imm=0) R1_w=ctx(id=0,off=0,imm=0)
>      R2_w=map_value(id=0,off=0,ks=4,vs=1600,imm=0) R3_w=inv800 R4_w=inv256
>      R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=4,vs=1600,imm=0)
>      R9_w=inv800 R10=fp0,call_-1
>     53: (bf) r8 = r0
>     54: (bf) r1 = r8
>     55: (67) r1 <<= 32
>     56: (bf) r2 = r1
>     57: (77) r2 >>= 32
>     58: (25) if r2 > 0x31f goto pc+33
>      R0=inv(id=0) R1=inv(id=0,smax_value=9223372032559808512,
>                          umax_value=18446744069414584320,
>                          var_off=(0x0; 0xffffffff00000000))
>      R2=inv(id=0,umax_value=799,var_off=(0x0; 0x3ff))
>      R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=4,vs=1600,imm=0)
>      R8=inv(id=0) R9=inv800 R10=fp0,call_-1
>     59: (1f) r9 -= r8
>     60: (c7) r1 s>>= 32
>     61: (bf) r2 = r7
>     62: (0f) r2 += r1
>     math between map_value pointer and register with unbounded
>     min value is not allowed
> The failure is due to llvm compiler optimization where register "r2",
> which is a copy of "r1", is tested for condition while later on "r1"
> is used for map_ptr operation. The verifier is not able to track such
> inst sequence effectively.
> 
> Without the "usize > max_len" condition, there is no llvm optimization
> and the below generated code passed verifier:
>     52: (85) call bpf_get_stack#65
>      R0=map_value(id=0,off=0,ks=4,vs=1600,imm=0) R1_w=ctx(id=0,off=0,imm=0)
>      R2_w=map_value(id=0,off=0,ks=4,vs=1600,imm=0) R3_w=inv800 R4_w=inv256
>      R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=4,vs=1600,imm=0)
>      R9_w=inv800 R10=fp0,call_-1
>     53: (b7) r1 = 0
>     54: (bf) r8 = r0
>     55: (67) r8 <<= 32
>     56: (c7) r8 s>>= 32
>     57: (6d) if r1 s> r8 goto pc+24
>      R0=inv(id=0,umax_value=800) R1=inv0 R6=ctx(id=0,off=0,imm=0)
>      R7=map_value(id=0,off=0,ks=4,vs=1600,imm=0)
>      R8=inv(id=0,umax_value=800,var_off=(0x0; 0x3ff)) R9=inv800
>      R10=fp0,call_-1
>     58: (bf) r2 = r7
>     59: (0f) r2 += r8
>     60: (1f) r9 -= r8
>     61: (bf) r1 = r6
> 
> Signed-off-by: Yonghong Song <yhs@fb.com>
> ---
>  kernel/bpf/verifier.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index aba9425..3c8bb92 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -164,6 +164,8 @@ struct bpf_call_arg_meta {
>  	bool pkt_access;
>  	int regno;
>  	int access_size;
> +	s64 msize_smax_value;
> +	u64 msize_umax_value;
>  };
>  
>  static DEFINE_MUTEX(bpf_verifier_lock);
> @@ -2027,6 +2029,14 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno,
>  		err = check_helper_mem_access(env, regno - 1,
>  					      reg->umax_value,
>  					      zero_size_allowed, meta);
> +
> +		if (!err && !!meta) {

Please drop !! in the above.

Also what happens when
if (!tnum_is_const(reg->var_off))
  meta = NULL;
?
It seems the two new fields of meta will stay zero-initialized, so
later do_refine_retval_range() will set R0->umax_value = 0,
which seems incorrect.
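
A stripped-down user-space model of the concern, only as a sketch.
struct reg_model, struct meta_model, record_msize(), refine_retval() and
the msize_valid flag are all hypothetical and not part of the patch; the
flag is just one possible way to tell "never recorded" from "recorded
as 0".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct reg_model {
	int64_t smax;
	uint64_t umax;
};

struct meta_model {
	bool msize_valid;	/* hypothetical flag, not in the patch */
	int64_t msize_smax;
	uint64_t msize_umax;
};

/* Models the "if (!err && meta)" block: with meta == NULL nothing is
 * recorded and the caller's struct keeps its zero initialization.
 */
static void record_msize(struct meta_model *meta,
			 const struct reg_model *size_reg)
{
	if (!meta)
		return;
	meta->msize_smax = size_reg->smax;
	meta->msize_umax = size_reg->umax;
	meta->msize_valid = true;
}

/* Only narrow R0 when a size was actually recorded. */
static void refine_retval(struct reg_model *r0,
			  const struct meta_model *meta)
{
	assert(r0 && meta);
	if (!meta->msize_valid)
		return;
	r0->smax = meta->msize_smax;
	r0->umax = meta->msize_umax;
}
```

Without such a check, the unrecorded case would copy the zeroes into R0
and claim the helper can only return values up to 0.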

> +			/* remember the mem_size which may be used later
> +			 * to refine return values.
> +			 */
> +			meta->msize_smax_value = reg->smax_value;
> +			meta->msize_umax_value = reg->umax_value;
> +		}
>  	}
>  
>  	return err;
> @@ -2333,6 +2343,21 @@ static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx)
>  	return 0;
>  }
>  
> +static void do_refine_retval_range(struct bpf_reg_state *regs, int ret_type,
> +				   int func_id,
> +				   struct bpf_call_arg_meta *meta)
> +{
> +	struct bpf_reg_state *ret_reg = &regs[BPF_REG_0];
> +
> +	if (ret_type != RET_INTEGER ||
> +	    (func_id != BPF_FUNC_get_stack &&
> +	     func_id != BPF_FUNC_probe_read_str))
> +		return;
> +
> +	ret_reg->smax_value = meta->msize_smax_value;
> +	ret_reg->umax_value = meta->msize_umax_value;
> +}
> +
>  static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn_idx)
>  {
>  	const struct bpf_func_proto *fn = NULL;
> @@ -2456,6 +2481,8 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn
>  		return -EINVAL;
>  	}
>  
> +	do_refine_retval_range(regs, fn->ret_type, func_id, &meta);
> +
>  	err = check_map_func_compatibility(env, meta.map_ptr, func_id);
>  	if (err)
>  		return err;
> -- 
> 2.9.5
> 

^ permalink raw reply

* XDP breakage with virtio due to 6870de435b90c083ae0f3f7f341287976ef56f03
From: David Ahern @ 2018-04-22 22:47 UTC (permalink / raw)
  To: tehnerd, Jason Wang, daniel; +Cc: netdev@vger.kernel.org

This commit breaks my FIB forwarding program:

commit 6870de435b90c083ae0f3f7f341287976ef56f03
Author: Nikita V. Shirokov <tehnerd@tehnerd.com>
Date:   Tue Apr 17 21:42:20 2018 -0700

    bpf: make virtio compatible w/ bpf_xdp_adjust_tail

    w/ bpf_xdp_adjust_tail helper xdp's data_end pointer could be changed as
    well (only "decrease" of pointer's location is going to be supported).
    changing of this pointer will change packet's size.
    for virtio driver we need to adjust XDP_PASS handling by recalculating
    length of the packet if it was passed to the TCP/IP stack

    Reviewed-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

###

Some of the packets (e.g., ARP or those without a resolved neighbor) are
passed to the networking stack. What shows up are clearly broken packets:

# tcpdump -n -i eth1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 262144 bytes
15:45:29.693238 [|ARP]
	0x0000:  0001 0800 0604 0001 42c0 ce2f 3fa9 0a64  ........B../?..d
15:45:30.710327 [|ARP]
	0x0000:  0001 0800 0604 0001 42c0 ce2f 3fa9 0a64  ........B../?..d
15:45:31.734296 [|ARP]
	0x0000:  0001 0800 0604 0001 42c0 ce2f 3fa9 0a64  ........B../?..d
15:45:32.908720 IP6 truncated-ip6 - 12 bytes
missing!fe80::40c0:ceff:fe2f:3fa9 > ff02::1:ff00:2: ICMP6, neighbor
solicitation[|icmp6]
15:45:33.910530 IP6 truncated-ip6 - 12 bytes missing!2001:db8:1::64 >
ff02::1:ff00:2: ICMP6, neighbor solicitation[|icmp6]
15:45:34.934437 IP6 truncated-ip6 - 12 bytes missing!2001:db8:1::64 >
ff02::1:ff00:2: ICMP6, neighbor solicitation[|icmp6]
15:45:35.958394 IP6 truncated-ip6 - 12 bytes missing!2001:db8:1::64 >
ff02::1:ff00:2: ICMP6, neighbor solicitation[|icmp6]

Reverting the mentioned patch fixes the problem.
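
For reference, a small sketch that decodes the ARP bytes from the dump
above. It assumes the 16 bytes shown are everything that was captured of
the ARP payload; parse_arp_prefix() and arp_missing_bytes() are helper
names invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Ethernet/IPv4 ARP: 8-byte fixed header + 2 * (6 + 4) address bytes */
#define ARP_ETH_IPV4_LEN 28

struct arp_prefix {
	uint16_t htype, ptype, oper;
	uint8_t hlen, plen;
	uint8_t sha[6];		/* sender hardware address */
};

/* The 16 bytes printed by tcpdump for each truncated ARP packet */
static const uint8_t dump[16] = {
	0x00, 0x01, 0x08, 0x00, 0x06, 0x04, 0x00, 0x01,
	0x42, 0xc0, 0xce, 0x2f, 0x3f, 0xa9, 0x0a, 0x64,
};

static uint16_t be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Decode the fixed header and sender MAC; returns -1 if too short. */
static int parse_arp_prefix(const uint8_t *buf, size_t len,
			    struct arp_prefix *out)
{
	assert(buf && out);
	if (len < 14)
		return -1;
	out->htype = be16(buf);
	out->ptype = be16(buf + 2);
	out->hlen = buf[4];
	out->plen = buf[5];
	out->oper = be16(buf + 6);
	memcpy(out->sha, buf + 8, sizeof(out->sha));
	return 0;
}

/* How many bytes of a full Ethernet/IPv4 ARP payload are absent? */
static size_t arp_missing_bytes(size_t captured)
{
	return captured < ARP_ETH_IPV4_LEN ? ARP_ETH_IPV4_LEN - captured : 0;
}
```

The prefix decodes as an Ethernet/IPv4 ARP request from
42:c0:ce:2f:3f:a9, and under the assumption above 12 of the 28 payload
bytes are absent, the same count tcpdump reports as missing for the IPv6
packets.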

^ permalink raw reply

* [PATCH bpf-next v4 1/2] bpf: allow map helpers access to map values directly
From: Paul Chaignon @ 2018-04-22 21:52 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, netdev; +Cc: iovisor-dev, paul.chaignon
In-Reply-To: <cover.1524407664.git.paul.chaignon@orange.com>

Helpers that expect ARG_PTR_TO_MAP_KEY and ARG_PTR_TO_MAP_VALUE can only
access stack and packet memory.  Allow these helpers to directly access
map values by passing registers of type PTR_TO_MAP_VALUE.

This change removes the need for an extra copy to the stack when using a
map value to perform a second map lookup, as in the following:

struct bpf_map_def SEC("maps") infobyreq = {
    .type = BPF_MAP_TYPE_HASH,
    .key_size = sizeof(struct request *),
    .value_size = sizeof(struct info_t),
    .max_entries = 1024,
};
struct bpf_map_def SEC("maps") counts = {
    .type = BPF_MAP_TYPE_HASH,
    .key_size = sizeof(struct info_t),
    .value_size = sizeof(u64),
    .max_entries = 1024,
};
SEC("kprobe/blk_account_io_start")
int bpf_blk_account_io_start(struct pt_regs *ctx)
{
    struct info_t *info = bpf_map_lookup_elem(&infobyreq, &ctx->di);
    u64 *count;

    if (!info)
        return 0;
    /* The map value is passed directly as the key of the second lookup. */
    count = bpf_map_lookup_elem(&counts, info);
    if (count)
        (*count)++;
    return 0;
}

Signed-off-by: Paul Chaignon <paul.chaignon@orange.com>
---
 kernel/bpf/verifier.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 5dd1dcb902bf..70e00beade03 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1914,7 +1914,7 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno,
 	if (arg_type == ARG_PTR_TO_MAP_KEY ||
 	    arg_type == ARG_PTR_TO_MAP_VALUE) {
 		expected_type = PTR_TO_STACK;
-		if (!type_is_pkt_pointer(type) &&
+		if (!type_is_pkt_pointer(type) && type != PTR_TO_MAP_VALUE &&
 		    type != expected_type)
 			goto err_type;
 	} else if (arg_type == ARG_CONST_SIZE ||
@@ -1970,6 +1970,9 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno,
 			err = check_packet_access(env, regno, reg->off,
 						  meta->map_ptr->key_size,
 						  false);
+		else if (type == PTR_TO_MAP_VALUE)
+			err = check_map_access(env, regno, reg->off,
+					       meta->map_ptr->key_size, false);
 		else
 			err = check_stack_boundary(env, regno,
 						   meta->map_ptr->key_size,
@@ -1987,6 +1990,10 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno,
 			err = check_packet_access(env, regno, reg->off,
 						  meta->map_ptr->value_size,
 						  false);
+		else if (type == PTR_TO_MAP_VALUE)
+			err = check_map_access(env, regno, reg->off,
+					       meta->map_ptr->value_size,
+					       false);
 		else
 			err = check_stack_boundary(env, regno,
 						   meta->map_ptr->value_size,
-- 
2.14.1

^ permalink raw reply related

* [PATCH bpf-next v4 2/2] tools/bpf: add verifier tests for accesses to map
From: Paul Chaignon via iovisor-dev @ 2018-04-22 21:52 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann,
	netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: iovisor-dev-9jONkmmOlFHEE9lA1F8Ukti2O/JbrIOy
In-Reply-To: <cover.1524407664.git.paul.chaignon-C0LM0jrOve7QT0dZR+AlfA@public.gmane.org>

This patch adds new test cases for accesses to map values from map
helpers.
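
The accept/reject expectations in these tests follow from a simple bounds
rule, sketched below in user-space C. This is only a model of the rule,
not the verifier's check_map_access(); map_value_access_ok() and
map_value_range_access_ok() are names chosen for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the byte range [off, off + size) lies inside a map value of
 * value_size bytes. Zero-sized accesses are treated as invalid here.
 */
static bool map_value_access_ok(int64_t off, uint32_t size,
				uint32_t value_size)
{
	assert(value_size > 0);
	return off >= 0 && size != 0 &&
	       (uint64_t)off + size <= value_size;
}

/* Variable offset: both ends of the possible offset range must fit. */
static bool map_value_range_access_ok(int64_t min_off, int64_t max_off,
				      uint32_t size, uint32_t value_size)
{
	assert(min_off <= max_off);
	return map_value_access_ok(min_off, size, value_size) &&
	       map_value_access_ok(max_off, size, value_size);
}
```

With struct other_val (16 bytes, bar at offset 8) and an 8-byte key, an
access at offset 8 fits, while offsets 12, -4 and 9 do not, matching the
errstr values expected by the negative tests.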

Signed-off-by: Paul Chaignon <paul.chaignon-C0LM0jrOve7QT0dZR+AlfA@public.gmane.org>
---
 tools/testing/selftests/bpf/test_verifier.c | 266 ++++++++++++++++++++++++++++
 1 file changed, 266 insertions(+)

diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index 3e7718b1a9ae..165e9ddfa446 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -64,6 +64,7 @@ struct bpf_test {
 	struct bpf_insn	insns[MAX_INSNS];
 	int fixup_map1[MAX_FIXUPS];
 	int fixup_map2[MAX_FIXUPS];
+	int fixup_map3[MAX_FIXUPS];
 	int fixup_prog[MAX_FIXUPS];
 	int fixup_map_in_map[MAX_FIXUPS];
 	const char *errstr;
@@ -88,6 +89,11 @@ struct test_val {
 	int foo[MAX_ENTRIES];
 };
 
+struct other_val {
+	long long foo;
+	long long bar;
+};
+
 static struct bpf_test tests[] = {
 	{
 		"add+sub+mul",
@@ -5593,6 +5599,257 @@ static struct bpf_test tests[] = {
 		.errstr = "R1 min value is negative",
 		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
 	},
+	{
+		"map lookup helper access to map",
+		.insns = {
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+			BPF_ST_MEM(BPF_DW, BPF_REG_2, 0, 0),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 4),
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_0),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_EXIT_INSN(),
+		},
+		.fixup_map3 = { 3, 8 },
+		.result = ACCEPT,
+		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	},
+	{
+		"map update helper access to map",
+		.insns = {
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+			BPF_ST_MEM(BPF_DW, BPF_REG_2, 0, 0),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 6),
+			BPF_MOV64_IMM(BPF_REG_4, 0),
+			BPF_MOV64_REG(BPF_REG_3, BPF_REG_0),
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_0),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_update_elem),
+			BPF_EXIT_INSN(),
+		},
+		.fixup_map3 = { 3, 10 },
+		.result = ACCEPT,
+		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	},
+	{
+		"map update helper access to map: wrong size",
+		.insns = {
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+			BPF_ST_MEM(BPF_DW, BPF_REG_2, 0, 0),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 6),
+			BPF_MOV64_IMM(BPF_REG_4, 0),
+			BPF_MOV64_REG(BPF_REG_3, BPF_REG_0),
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_0),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_update_elem),
+			BPF_EXIT_INSN(),
+		},
+		.fixup_map1 = { 3 },
+		.fixup_map3 = { 10 },
+		.result = REJECT,
+		.errstr = "invalid access to map value, value_size=8 off=0 size=16",
+		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	},
+	{
+		"map helper access to adjusted map (via const imm)",
+		.insns = {
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+			BPF_ST_MEM(BPF_DW, BPF_REG_2, 0, 0),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 5),
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_0),
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2,
+				      offsetof(struct other_val, bar)),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_EXIT_INSN(),
+		},
+		.fixup_map3 = { 3, 9 },
+		.result = ACCEPT,
+		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	},
+	{
+		"map helper access to adjusted map (via const imm): out-of-bound 1",
+		.insns = {
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+			BPF_ST_MEM(BPF_DW, BPF_REG_2, 0, 0),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 5),
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_0),
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2,
+				      sizeof(struct other_val) - 4),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_EXIT_INSN(),
+		},
+		.fixup_map3 = { 3, 9 },
+		.result = REJECT,
+		.errstr = "invalid access to map value, value_size=16 off=12 size=8",
+		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	},
+	{
+		"map helper access to adjusted map (via const imm): out-of-bound 2",
+		.insns = {
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+			BPF_ST_MEM(BPF_DW, BPF_REG_2, 0, 0),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 5),
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_0),
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -4),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_EXIT_INSN(),
+		},
+		.fixup_map3 = { 3, 9 },
+		.result = REJECT,
+		.errstr = "invalid access to map value, value_size=16 off=-4 size=8",
+		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	},
+	{
+		"map helper access to adjusted map (via const reg)",
+		.insns = {
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+			BPF_ST_MEM(BPF_DW, BPF_REG_2, 0, 0),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 6),
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_0),
+			BPF_MOV64_IMM(BPF_REG_3,
+				      offsetof(struct other_val, bar)),
+			BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_3),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_EXIT_INSN(),
+		},
+		.fixup_map3 = { 3, 10 },
+		.result = ACCEPT,
+		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	},
+	{
+		"map helper access to adjusted map (via const reg): out-of-bound 1",
+		.insns = {
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+			BPF_ST_MEM(BPF_DW, BPF_REG_2, 0, 0),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 6),
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_0),
+			BPF_MOV64_IMM(BPF_REG_3,
+				      sizeof(struct other_val) - 4),
+			BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_3),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_EXIT_INSN(),
+		},
+		.fixup_map3 = { 3, 10 },
+		.result = REJECT,
+		.errstr = "invalid access to map value, value_size=16 off=12 size=8",
+		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	},
+	{
+		"map helper access to adjusted map (via const reg): out-of-bound 2",
+		.insns = {
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+			BPF_ST_MEM(BPF_DW, BPF_REG_2, 0, 0),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 6),
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_0),
+			BPF_MOV64_IMM(BPF_REG_3, -4),
+			BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_3),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_EXIT_INSN(),
+		},
+		.fixup_map3 = { 3, 10 },
+		.result = REJECT,
+		.errstr = "invalid access to map value, value_size=16 off=-4 size=8",
+		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	},
+	{
+		"map helper access to adjusted map (via variable)",
+		.insns = {
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+			BPF_ST_MEM(BPF_DW, BPF_REG_2, 0, 0),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 7),
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_0),
+			BPF_LDX_MEM(BPF_W, BPF_REG_3, BPF_REG_0, 0),
+			BPF_JMP_IMM(BPF_JGT, BPF_REG_3,
+				    offsetof(struct other_val, bar), 4),
+			BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_3),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_EXIT_INSN(),
+		},
+		.fixup_map3 = { 3, 11 },
+		.result = ACCEPT,
+		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	},
+	{
+		"map helper access to adjusted map (via variable): no max check",
+		.insns = {
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+			BPF_ST_MEM(BPF_DW, BPF_REG_2, 0, 0),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 6),
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_0),
+			BPF_LDX_MEM(BPF_W, BPF_REG_3, BPF_REG_0, 0),
+			BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_3),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_EXIT_INSN(),
+		},
+		.fixup_map3 = { 3, 10 },
+		.result = REJECT,
+		.errstr = "R2 unbounded memory access, make sure to bounds check any array access into a map",
+		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	},
+	{
+		"map helper access to adjusted map (via variable): wrong max check",
+		.insns = {
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+			BPF_ST_MEM(BPF_DW, BPF_REG_2, 0, 0),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 7),
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_0),
+			BPF_LDX_MEM(BPF_W, BPF_REG_3, BPF_REG_0, 0),
+			BPF_JMP_IMM(BPF_JGT, BPF_REG_3,
+				    offsetof(struct other_val, bar) + 1, 4),
+			BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_3),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_EXIT_INSN(),
+		},
+		.fixup_map3 = { 3, 11 },
+		.result = REJECT,
+		.errstr = "invalid access to map value, value_size=16 off=9 size=8",
+		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	},
 	{
 		"map element value is preserved across register spilling",
 		.insns = {
@@ -11533,6 +11790,7 @@ static void do_test_fixup(struct bpf_test *test, struct bpf_insn *prog,
 {
 	int *fixup_map1 = test->fixup_map1;
 	int *fixup_map2 = test->fixup_map2;
+	int *fixup_map3 = test->fixup_map3;
 	int *fixup_prog = test->fixup_prog;
 	int *fixup_map_in_map = test->fixup_map_in_map;
 
@@ -11556,6 +11814,14 @@ static void do_test_fixup(struct bpf_test *test, struct bpf_insn *prog,
 		} while (*fixup_map2);
 	}
 
+	if (*fixup_map3) {
+		map_fds[1] = create_map(sizeof(struct other_val), 1);
+		do {
+			prog[*fixup_map3].imm = map_fds[1];
+			fixup_map3++;
+		} while (*fixup_map3);
+	}
+
 	if (*fixup_prog) {
 		map_fds[2] = create_prog_array();
 		do {
-- 
2.14.1

^ permalink raw reply related

* [PATCH bpf-next v4 0/2] bpf: allow map helpers access to map values directly
From: Paul Chaignon via iovisor-dev @ 2018-04-22 21:50 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann,
	netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: iovisor-dev-9jONkmmOlFHEE9lA1F8Ukti2O/JbrIOy

Currently, helpers that expect ARG_PTR_TO_MAP_KEY and ARG_PTR_TO_MAP_VALUE
can only access stack and packet memory.  This patchset allows these
helpers to directly access map values by passing registers of type
PTR_TO_MAP_VALUE.

The first patch changes the verifier; the second adds new test cases.

Previous versions of this patchset were sent on the iovisor-dev mailing
list only.

Changelogs:
  Changes in v4:
    - Rebase.
  Changes in v3:
    - Bug fixes.
    - Negative test cases.
  Changes in v2:
    - Additional test cases for adjusted maps.

Paul Chaignon (2):
  bpf: allow map helpers access to map values directly
  tools/bpf: add verifier tests for accesses to map

 kernel/bpf/verifier.c                       |   9 +-
 tools/testing/selftests/bpf/test_verifier.c | 266 ++++++++++++++++++++++++++++
 2 files changed, 274 insertions(+), 1 deletion(-)

-- 
2.14.1

^ permalink raw reply

* kTLS in combination with mlx4 is very unstable
From: Andre Tomt @ 2018-04-22 21:21 UTC (permalink / raw)
  To: netdev; +Cc: davejwatson, Tariq Toukan

Hello!

kTLS looks fun, so I decided to play with it. It is quite spiffy - 
however with mlx4 I get kernel crashes I'm not seeing when testing on ixgbe.

For testing I'm using a git build of the "stream reflector" cubemap[1] 
configured with kTLS and 8 worker threads running on 4 physical cores, 
loading it up with a ~13Mbps MPEG-TS stream pulled from satellite TV.

The kernel seems to get increasingly unstable as I load it up with 
client connections. At about 9Gbps and 700 connections, it is okay at 
least for a while - it might run fine for say 45 minutes. Once it gets 
to 20 - 30Gbps, the kernel will usually start spewing OOPSes within 
minutes and the traffic drops.

Some bad interaction between mlx4 and kTLS?

Hardware is a quad core Xeon-D 1520 using a dual port Mellanox 
ConnectX-3 VPI with a single 40Gbps ethernet link configured. Mellanox 
mlx4 driver settings are kernel.org upstream defaults. Interface is 
configured with FQ qdisc and sockets are using BBR congestion control.

Tested on kernel 4.14.34, 4.15.17, and 4.16.2 - 4.16.3.
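
The call trace in this report repeats the sequence tls_push_record ->
tls_push_sg -> do_tcp_sendpages -> tls_write_space -> tls_push_record
until the stack guard page is hit. The user-space sketch below models
only that shape of re-entrancy and one possible guard. All names
(tx_ctx, push_record(), send_pages(), write_space(), in_sendpages) are
hypothetical and this is not the kernel code or a proposed patch.

```c
#include <assert.h>
#include <stdbool.h>

struct tx_ctx {
	int pending;		/* records still waiting to be pushed */
	int depth;		/* current nesting of push_record() */
	int max_depth;		/* deepest nesting observed */
	bool in_sendpages;	/* hypothetical re-entrancy guard */
	bool use_guard;
};

static void push_record(struct tx_ctx *c);

/* Stands in for a write-space callback fired from inside the send path. */
static void write_space(struct tx_ctx *c)
{
	if (c->use_guard && c->in_sendpages)
		return;		/* already sending: leave it to the outer caller */
	if (c->pending > 0)
		push_record(c);
}

/* Stands in for the send path, which may invoke the callback. */
static void send_pages(struct tx_ctx *c)
{
	c->in_sendpages = true;
	write_space(c);
	c->in_sendpages = false;
}

static void push_record(struct tx_ctx *c)
{
	assert(c->pending > 0);
	c->pending--;
	c->depth++;
	if (c->depth > c->max_depth)
		c->max_depth = c->depth;
	send_pages(c);
	c->depth--;
}

/* Push all records and report the deepest nesting that occurred. */
static int run(int records, bool use_guard)
{
	struct tx_ctx c = { .pending = records, .use_guard = use_guard };

	while (c.pending > 0)
		push_record(&c);
	return c.max_depth;
}
```

In this model, run(100, false) nests one level per pending record, while
run(100, true) never goes deeper than one level because the callback
returns early when it is entered from the send path.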

[1] https://git.sesse.net/?p=cubemap

First OOPS (from 4.16.3):
> [  660.467358] BUG: stack guard page was hit at 00000000b136e403 (stack is 00000000ded3f179..00000000835ee6c5)
> [  660.467422] kernel stack overflow (double-fault): 0000 [#1] SMP PTI
> [  660.467457] Modules linked in: coretemp intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp iTCO_wdt gpio_ich iTCO_vendor_support kvm_intel mxm_wmi xfs libcrc32c kvm crc32c_generic irqbypass nls_iso8859_1 crct10dif_pclmul crc32_pclmul nls_cp437 ghash_clmulni_intel vfat fat aesni_intel aes_x86_64 crypto_simd cryptd glue_helper intel_pch_thermal mei_me sg mei lpc_ich mfd_core evdev ipmi_si ipmi_devintf ipmi_msghandler wmi acpi_pad tls ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 hid_generic usbhid hid mlx4_ib mlx4_en ib_core sd_mod ast i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect ehci_pci xhci_pci sysimgblt fb_sys_fops ahci libahci xhci_hcd ehci_hcd libata crc32c_intel nvme drm usbcore scsi_mod mlx4_core ixgbe i2c_core mdio usb_common devlink hwmon nvme_core rtc_cmos
> [  660.467856] CPU: 4 PID: 660 Comm: cubemap Not tainted 4.16.0-1 #1
> [  660.467890] Hardware name: Supermicro Super Server/X10SDV-4C-TLN2F, BIOS 1.2c 09/19/2017
> [  660.467939] RIP: 0010:__kmalloc+0x7/0x1f0
> [  660.467962] RSP: 0018:ffffabafc27b8000 EFLAGS: 00010206
> [  660.467992] RAX: 000000000000000d RBX: 0000000000000010 RCX: ffffabafc27b8070
> [  660.468030] RDX: ffff98a0d0235490 RSI: 0000000001080020 RDI: 000000000000001d
> [  660.468069] RBP: 000000000000000d R08: ffff98a0d5be4860 R09: ffff98a0ec299180
> [  660.468106] R10: ffffabafc27b80b8 R11: 0000000000000010 R12: 0000000000000010
> [  660.468145] R13: ffff98a0ec299180 R14: ffff98a0ec299180 R15: 0000000000000000
> [  660.468184] FS:  00007f8a35ffb700(0000) GS:ffff98a17fd00000(0000) knlGS:0000000000000000
> [  660.468227] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  660.468258] CR2: ffffabafc27b7ff8 CR3: 00000004698ee001 CR4: 00000000003606e0
> [  660.468297] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  660.468334] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  660.468373] Call Trace:
> [  660.468401]  gcmaes_encrypt.constprop.5+0x137/0x240 [aesni_intel]
> [  660.468439]  ? generic_gcmaes_encrypt+0x5f/0x80 [aesni_intel]
> [  660.468476]  ? gcmaes_wrapper_encrypt+0x36/0x80 [aesni_intel]
> [  660.468511]  ? tls_push_record+0x1d3/0x390 [tls]
> [  660.468537]  ? tls_push_record+0x1d3/0x390 [tls]
> [  660.468565]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.468593]  ? do_tcp_sendpages+0x8d/0x580
> [  660.468618]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.468643]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.468671]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.468697]  ? do_tcp_sendpages+0x8d/0x580
> [  660.468722]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.468748]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.468776]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.468802]  ? do_tcp_sendpages+0x8d/0x580
> [  660.468826]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.468852]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.468880]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.468906]  ? do_tcp_sendpages+0x8d/0x580
> [  660.468931]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.468957]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.470165]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.471363]  ? do_tcp_sendpages+0x8d/0x580
> [  660.472555]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.473713]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.474838]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.475927]  ? do_tcp_sendpages+0x8d/0x580
> [  660.476977]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.477999]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.478968]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.479902]  ? do_tcp_sendpages+0x8d/0x580
> [  660.480790]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.481644]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.482483]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.483301]  ? do_tcp_sendpages+0x8d/0x580
> [  660.484099]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.484891]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.485674]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.486455]  ? do_tcp_sendpages+0x8d/0x580
> [  660.487220]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.487890]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.488328]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.488748]  ? do_tcp_sendpages+0x8d/0x580
> [  660.489167]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.489565]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.489970]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.490370]  ? do_tcp_sendpages+0x8d/0x580
> [  660.490771]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.491165]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.491550]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.491914]  ? do_tcp_sendpages+0x8d/0x580
> [  660.492274]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.492641]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.493008]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.493374]  ? do_tcp_sendpages+0x8d/0x580
> [  660.493787]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.494177]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.494585]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.494972]  ? do_tcp_sendpages+0x8d/0x580
> [  660.495359]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.495742]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.496128]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.496512]  ? do_tcp_sendpages+0x8d/0x580
> [  660.496901]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.497301]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.497697]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.498096]  ? do_tcp_sendpages+0x8d/0x580
> [  660.498490]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.498884]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.499291]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.499700]  ? do_tcp_sendpages+0x8d/0x580
> [  660.500103]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.500511]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.500909]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.501326]  ? do_tcp_sendpages+0x8d/0x580
> [  660.501737]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.502131]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.502525]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.502928]  ? do_tcp_sendpages+0x8d/0x580
> [  660.503331]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.503724]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.504127]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.504547]  ? do_tcp_sendpages+0x8d/0x580
> [  660.504949]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.505348]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.505769]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.506207]  ? do_tcp_sendpages+0x8d/0x580
> [  660.506622]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.507030]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.507435]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.507841]  ? do_tcp_sendpages+0x8d/0x580
> [  660.508518]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.509261]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.510011]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.510187] BUG: stack guard page was hit at 00000000e0315e51 (stack is 00000000bea6f919..0000000005fc5eb4)
> [  660.510473] BUG: stack guard page was hit at 000000004b958a15 (stack is 000000001f2af2d1..000000006295a4b1)
> [  660.510758]  ? do_tcp_sendpages+0x8d/0x580
> [  660.513094]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.513886]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.514680]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.515487]  ? do_tcp_sendpages+0x8d/0x580
> [  660.515750] BUG: stack guard page was hit at 00000000bc93cf0d (stack is 0000000031a15c9c..0000000029a82776)
> [  660.516295]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.518017]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.518883]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.519752]  ? do_tcp_sendpages+0x8d/0x580
> [  660.519816] BUG: stack guard page was hit at 000000002d1db286 (stack is 00000000b5bb06d4..000000007a29c8f2)
> [  660.520544]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.522315]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.523162]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.524006]  ? do_tcp_sendpages+0x8d/0x580
> [  660.524849]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.525695]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.526545]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.527399]  ? do_tcp_sendpages+0x8d/0x580
> [  660.528247]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.529099]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.529955]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.530797]  ? do_tcp_sendpages+0x8d/0x580
> [  660.531643]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.532010] BUG: stack guard page was hit at 0000000027abda92 (stack is 00000000aadcb221..00000000a587b67b)
> [  660.532535]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.534511]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.535506]  ? do_tcp_sendpages+0x8d/0x580
> [  660.536500]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.537495]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.538493]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.539482]  ? do_tcp_sendpages+0x8d/0x580
> [  660.540462]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.541447]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.542430]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.543411]  ? do_tcp_sendpages+0x8d/0x580
> [  660.544395]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.545382]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.546365]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.547347]  ? do_tcp_sendpages+0x8d/0x580
> [  660.548334]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.549318]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.550300]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.551284]  ? do_tcp_sendpages+0x8d/0x580
> [  660.552267]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.553250]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.554205]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.555158]  ? do_tcp_sendpages+0x8d/0x580
> [  660.556083]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.557009]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.557936]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.558862]  ? do_tcp_sendpages+0x8d/0x580
> [  660.559786]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.560681]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.561547]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.562413]  ? do_tcp_sendpages+0x8d/0x580
> [  660.563279]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.564143]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.564979]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.565783]  ? do_tcp_sendpages+0x8d/0x580
> [  660.566587]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.567392]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.568197]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.569000]  ? do_tcp_sendpages+0x8d/0x580
> [  660.569804]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.570609]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.571415]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.572218]  ? do_tcp_sendpages+0x8d/0x580
> [  660.573023]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.573830]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.574634]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.575437]  ? do_tcp_sendpages+0x8d/0x580
> [  660.576210]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.576953]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.577698]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.578441]  ? do_tcp_sendpages+0x8d/0x580
> [  660.579183]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.579929]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.580673]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.581417]  ? do_tcp_sendpages+0x8d/0x580
> [  660.582159]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.582904]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.583649]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.584394]  ? do_tcp_sendpages+0x8d/0x580
> [  660.585137]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.585882]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.586628]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.587372]  ? do_tcp_sendpages+0x8d/0x580
> [  660.588115]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.588861]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.589605]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.590350]  ? do_tcp_sendpages+0x8d/0x580
> [  660.591093]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.591840]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.592585]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.593328]  ? do_tcp_sendpages+0x8d/0x580
> [  660.594072]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.594816]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.595563]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.596308]  ? do_tcp_sendpages+0x8d/0x580
> [  660.597050]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.597794]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.598540]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.599281]  ? do_tcp_sendpages+0x8d/0x580
> [  660.600025]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.600772]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.601517]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.602260]  ? do_tcp_sendpages+0x8d/0x580
> [  660.603003]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.603750]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.604495]  ? tls_write_space+0x6a/0x80 [tls]
> [  660.605238]  ? do_tcp_sendpages+0x8d/0x580
> [  660.605981]  ? tls_push_sg+0x74/0x130 [tls]
> [  660.606726]  ? tls_push_record+0x24a/0x390 [tls]
> [  660.607474]  ? tls_sw_sendpage+0x14a/0x390 [tls]
> [  660.608214]  ? direct_splice_actor+0x40/0x40
> [  660.608951]  ? inet_sendpage+0x40/0xf0
> [  660.609689]  ? kernel_sendpage+0x1a/0x30
> [  660.610426]  ? sock_sendpage+0x20/0x30
> [  660.611161]  ? pipe_to_sendpage+0x5f/0x70
> [  660.611898]  ? __splice_from_pipe+0x80/0x180
> [  660.612637]  ? generic_file_splice_read+0x100/0x150
> [  660.613382]  ? direct_splice_actor+0x40/0x40
> [  660.614128]  ? splice_from_pipe+0x4f/0x70
> [  660.614871]  ? direct_splice_actor+0x35/0x40
> [  660.615619]  ? splice_direct_to_actor+0xce/0x1d0
> [  660.616368]  ? generic_pipe_buf_nosteal+0x10/0x10
> [  660.617122]  ? do_splice_direct+0x8c/0xa0
> [  660.617876]  ? do_sendfile+0x19d/0x380
> [  660.618626]  ? SyS_sendfile64+0x4c/0x90
> [  660.619376]  ? do_syscall_64+0x7a/0x390
> [  660.620121]  ? do_page_fault+0x31/0x130
> [  660.620863]  ? entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> [  660.621618] Code: 24 08 4c 89 e9 48 89 de e8 d7 66 63 00 4d 8b 17 5a 4d 85 d2 75 d7 e9 e9 fe ff ff 48 89 c5 e9 e1 fe ff ff 90 0f 1f 44 00 00 41 57 <41> 56 41 55 41 54 55 53 48 81 ff 00 20 00 00 0f 87 a4 01 00 00 
> [  660.623316] RIP: __kmalloc+0x7/0x1f0 RSP: ffffabafc27b8000
> [  660.624168] ---[ end trace 7f6206177c0cc58f ]---

^ permalink raw reply

* Re: [PATCH 1/3] ethtool: Support ETHTOOL_GSTATS2 command.
From: Roopa Prabhu @ 2018-04-22 21:15 UTC (permalink / raw)
  To: David Miller
  Cc: Johannes Berg, Ben Greear, netdev,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	ath10k-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
In-Reply-To: <20180422.145420.1197041027922699603.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>

On Sun, Apr 22, 2018 at 11:54 AM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:
> From: Johannes Berg <johannes-cdvu00un1VgdHxzADdlk8Q@public.gmane.org>
> Date: Thu, 19 Apr 2018 17:26:57 +0200
>
>> On Thu, 2018-04-19 at 08:25 -0700, Ben Greear wrote:
>>>
>>> Maybe this could be in followup patches?  It's going to touch a lot of files,
>>> and might be hell to get merged all at once, and I've never used spatch, so
>>> just maybe someone else will volunteer that part :)
>>
>> I guess you'll have to ask davem. :)
>
> Well, first of all, I really don't like this.
>
> The first reason is that every time I see interface foo become foo2,
> foo3 is never far behind it.
>
> If foo was not extensible enough such that we needed foo2, we better
> design the new thing with explicitly better extensibility in mind.
>
> Furthermore, what you want here is a specific filter.  Someone else
> will want to filter on another criteria, and the next person will
> want yet another.
>
> This needs to be properly generalized.
>
> And frankly if we had moved to ethtool netlink/devlink by now, we
> could just add a netlink attribute for filtering and not even be
> having this conversation.


+1.

Also, the RTM_GETSTATS API was added to improve stats query efficiency
(with filters).
We should look at it to see if this fits there. Keeping all stats
queries in one place will help.
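
As a rough illustration of what a filtered RTM_GETSTATS query looks like on
the wire, here is a minimal sketch that only packs the request bytes. It
assumes the usual uapi layout of struct nlmsghdr followed by struct
if_stats_msg; the numeric constants are copied from the uapi headers as I
recall them, and the helper names (stats_filter_bit, build_getstats_request)
are made up for this example.

```python
import struct

# Constants as I recall them from linux/rtnetlink.h, linux/netlink.h and
# linux/if_link.h; copied here so the sketch is self-contained (assumption).
RTM_GETSTATS = 94
NLM_F_REQUEST = 0x01
IFLA_STATS_LINK_64 = 1
IFLA_STATS_LINK_XSTATS = 2


def stats_filter_bit(attr):
    """Mirror of IFLA_STATS_FILTER_BIT(): attribute N selects bit N - 1."""
    return 1 << (attr - 1)


def build_getstats_request(ifindex, filter_mask, seq=1):
    """Pack nlmsghdr + if_stats_msg for a single-interface stats query."""
    # struct if_stats_msg: family(u8) pad1(u8) pad2(u16) ifindex(u32) filter_mask(u32)
    body = struct.pack("=BBHII", 0, 0, 0, ifindex, filter_mask)
    # struct nlmsghdr: len(u32) type(u16) flags(u16) seq(u32) pid(u32)
    header = struct.pack("=IHHII", 16 + len(body), RTM_GETSTATS,
                         NLM_F_REQUEST, seq, 0)
    return header + body


# Ask only for the 64-bit link counters of interface index 2 (hypothetical index).
request = build_getstats_request(
    ifindex=2, filter_mask=stats_filter_bit(IFLA_STATS_LINK_64))
```

The packed bytes would be sent over an AF_NETLINK / NETLINK_ROUTE socket. The
filter_mask is what lets the caller ask for only the stats groups it cares
about, which is the kind of filtering discussed above.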

^ permalink raw reply

* Re: linux-next on x60: network manager often complains "network is disabled" after resume
From: Pavel Machek @ 2018-04-22 19:22 UTC (permalink / raw)
  To: Woody Suwalski
  Cc: Rafael J. Wysocki, kernel list, Linux-pm mailing list,
	Netdev list
In-Reply-To: <01331018-8c62-30b2-673c-11b0bcd38c0e@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1046 bytes --]

Hi!

> >I guess you can't reproduce it easily? I tried bisecting, but while it
> >happens often enough to make v4.17 hard to use, it does not permit
> >reliable bisect.
> >
> >These should be bad according to my notes
> >
> >b04240a33b99b32cf6fbdf5c943c04e505a0cb07
> >  ed80dc19e4dd395c951f745acd1484d61c4cfb20
> >  52113a0d3889d6e2738cf09bf79bc9cac7b5e1c6
> >  4fc97ef94bbfa185d16b3e44199b7559d0668747
> >  14ebdb2c814f508936fe178a2abc906a16a3ab48
> >  639adbeef5ae1bb8eeebbb0cde0b885397bde192
> >
> >bisection claimed
> >
> >c16add24522547bf52c189b3c0d1ab6f5c2b4375
> >
> >is first bad commit, but I'm not sure if I trust that.
> >									Pavel
> It has not happened on any of my systems in the last month. Good, but bad for
> getting more info :-(

My current theory is that it only happens if you suspend your machine
for > ten minutes, or something like that.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


^ permalink raw reply

* Re: [Patch net] llc: fix NULL pointer deref for SOCK_ZAPPED
From: David Miller @ 2018-04-22 18:56 UTC (permalink / raw)
  To: xiyou.wangcong; +Cc: netdev
In-Reply-To: <20180420045434.21477-1-xiyou.wangcong@gmail.com>

From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Thu, 19 Apr 2018 21:54:34 -0700

> For SOCK_ZAPPED socket, we don't need to care about llc->sap,
> so we should just skip these refcount functions in this case.
> 
> Fixes: f7e43672683b ("llc: hold llc_sap before release_sock()")
> Reported-by: kernel test robot <lkp@intel.com>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

Applied and queued up for -stable, thanks Cong.

^ permalink raw reply

* Re: [PATCH v2] net: ethernet: ti: cpsw: fix tx vlan priority mapping
From: David Miller @ 2018-04-22 18:56 UTC (permalink / raw)
  To: ivan.khoronzhuk; +Cc: grygorii.strashko, linux-omap, netdev, linux-kernel
In-Reply-To: <1524167349-11004-1-git-send-email-ivan.khoronzhuk@linaro.org>

From: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Date: Thu, 19 Apr 2018 22:49:09 +0300

> CPDMA_TX_PRIORITY_MAP is in fact the VLAN PCP field priority mapping
> register and basically replaces the VLAN PCP field for tagged packets.
> So, set it to a 1:1 mapping. Otherwise, it causes an unexpected
> change in egress VLAN tagged packets, like prio 2 -> prio 5.
> 
> Fixes: e05107e6b747 ("net: ethernet: ti: cpsw: add multi queue support")
> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
> ---

Applied and queued up for -stable.
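
To illustrate the remapping described in the commit message, here is a
minimal sketch. It assumes the register packs one 4-bit field per input
priority, with priority 0 in the lowest nibble; both constants and the
map_priority helper are illustrative and not quoted from the patch.

```python
# Assumed layout: nibble N of the register holds the output priority that
# replaces input priority N (priority 0 in the lowest nibble).
def map_priority(priority_map, prio):
    """Return the priority that the map substitutes for `prio`."""
    return (priority_map >> (4 * prio)) & 0xF


REVERSED_MAP = 0x01234567  # hypothetical non-identity value
IDENTITY_MAP = 0x76543210  # 1:1 mapping: nibble N holds N

for prio in range(8):
    print(prio, map_priority(REVERSED_MAP, prio), map_priority(IDENTITY_MAP, prio))
```

Under that assumption the reversed value turns priority 2 into priority 5,
the kind of unexpected change the commit message mentions, while the identity
value leaves every PCP value untouched.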

^ permalink raw reply

* Re: [Patch net] llc: delete timers synchronously in llc_sk_free()
From: David Miller @ 2018-04-22 18:55 UTC (permalink / raw)
  To: xiyou.wangcong; +Cc: netdev
In-Reply-To: <20180419192538.3362-1-xiyou.wangcong@gmail.com>

From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Thu, 19 Apr 2018 12:25:38 -0700

> The connection timers of an llc sock could be still flying
> after we delete them in llc_sk_free(), and even possibly
> after we free the sock. We could just wait synchronously
> here in case of troubles.
> 
> Note, I leave other call paths as they are, since they may
> not have to wait; we can still change them to delete timers
> synchronously when needed.
> 
> Also, move the code to net/llc/llc_conn.c, which is apparently
> a better place.
> 
> Reported-by: <syzbot+f922284c18ea23a8e457@syzkaller.appspotmail.com>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

Applied and queued up for -stable.

^ permalink raw reply

