* [PATCH net] rtnetlink: Disallow FDB configuration for non-Ethernet device
From: Ido Schimmel @ 2018-10-27 20:39 UTC
To: netdev@vger.kernel.org; +Cc: davem@davemloft.net, Ido Schimmel, Vlad Yasevich
When an FDB entry is configured, the address is validated to have the
length of an Ethernet address, but the device for which the address is
configured can be of any type.
The above can result in the use of uninitialized memory when the address
is later compared against existing addresses since 'dev->addr_len' is
used and it may be greater than ETH_ALEN, as with ip6tnl devices.
Fix this by making sure that FDB entries are only configured for
Ethernet devices.
BUG: KMSAN: uninit-value in memcmp+0x11d/0x180 lib/string.c:863
CPU: 1 PID: 4318 Comm: syz-executor998 Not tainted 4.19.0-rc3+ #49
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x14b/0x190 lib/dump_stack.c:113
kmsan_report+0x183/0x2b0 mm/kmsan/kmsan.c:956
__msan_warning+0x70/0xc0 mm/kmsan/kmsan_instr.c:645
memcmp+0x11d/0x180 lib/string.c:863
dev_uc_add_excl+0x165/0x7b0 net/core/dev_addr_lists.c:464
ndo_dflt_fdb_add net/core/rtnetlink.c:3463 [inline]
rtnl_fdb_add+0x1081/0x1270 net/core/rtnetlink.c:3558
rtnetlink_rcv_msg+0xa0b/0x1530 net/core/rtnetlink.c:4715
netlink_rcv_skb+0x36e/0x5f0 net/netlink/af_netlink.c:2454
rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:4733
netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
netlink_unicast+0x1638/0x1720 net/netlink/af_netlink.c:1343
netlink_sendmsg+0x1205/0x1290 net/netlink/af_netlink.c:1908
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg net/socket.c:631 [inline]
___sys_sendmsg+0xe70/0x1290 net/socket.c:2114
__sys_sendmsg net/socket.c:2152 [inline]
__do_sys_sendmsg net/socket.c:2161 [inline]
__se_sys_sendmsg+0x2a3/0x3d0 net/socket.c:2159
__x64_sys_sendmsg+0x4a/0x70 net/socket.c:2159
do_syscall_64+0xb8/0x100 arch/x86/entry/common.c:291
entry_SYSCALL_64_after_hwframe+0x63/0xe7
RIP: 0033:0x440ee9
Code: e8 cc ab 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 bb 0a fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fff6a93b518 EFLAGS: 00000213 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000440ee9
RDX: 0000000000000000 RSI: 0000000020000240 RDI: 0000000000000003
RBP: 0000000000000000 R08: 00000000004002c8 R09: 00000000004002c8
R10: 00000000004002c8 R11: 0000000000000213 R12: 000000000000b4b0
R13: 0000000000401ec0 R14: 0000000000000000 R15: 0000000000000000
Uninit was created at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:256 [inline]
kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:181
kmsan_kmalloc+0x98/0x100 mm/kmsan/kmsan_hooks.c:91
kmsan_slab_alloc+0x10/0x20 mm/kmsan/kmsan_hooks.c:100
slab_post_alloc_hook mm/slab.h:446 [inline]
slab_alloc_node mm/slub.c:2718 [inline]
__kmalloc_node_track_caller+0x9e7/0x1160 mm/slub.c:4351
__kmalloc_reserve net/core/skbuff.c:138 [inline]
__alloc_skb+0x2f5/0x9e0 net/core/skbuff.c:206
alloc_skb include/linux/skbuff.h:996 [inline]
netlink_alloc_large_skb net/netlink/af_netlink.c:1189 [inline]
netlink_sendmsg+0xb49/0x1290 net/netlink/af_netlink.c:1883
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg net/socket.c:631 [inline]
___sys_sendmsg+0xe70/0x1290 net/socket.c:2114
__sys_sendmsg net/socket.c:2152 [inline]
__do_sys_sendmsg net/socket.c:2161 [inline]
__se_sys_sendmsg+0x2a3/0x3d0 net/socket.c:2159
__x64_sys_sendmsg+0x4a/0x70 net/socket.c:2159
do_syscall_64+0xb8/0x100 arch/x86/entry/common.c:291
entry_SYSCALL_64_after_hwframe+0x63/0xe7
Fixes: 090096bf3db1 ("net: generic fdb support for drivers without ndo_fdb_<op>")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-and-tested-by: syzbot+3a288d5f5530b901310e@syzkaller.appspotmail.com
Reported-and-tested-by: syzbot+d53ab4e92a1db04110ff@syzkaller.appspotmail.com
Cc: Vlad Yasevich <vyasevich@gmail.com>
---
net/core/rtnetlink.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index f679c7a7d761..728a97f9f700 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3600,6 +3600,11 @@ static int rtnl_fdb_add(struct sk_buff *skb, struct nlmsghdr *nlh,
return -EINVAL;
}
+ if (dev->type != ARPHRD_ETHER) {
+ NL_SET_ERR_MSG(extack, "invalid device type");
+ return -EINVAL;
+ }
+
addr = nla_data(tb[NDA_LLADDR]);
err = fdb_vid_parse(tb[NDA_VLAN], &vid, extack);
@@ -3704,6 +3709,11 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh,
return -EINVAL;
}
+ if (dev->type != ARPHRD_ETHER) {
+ NL_SET_ERR_MSG(extack, "invalid device type");
+ return -EINVAL;
+ }
+
addr = nla_data(tb[NDA_LLADDR]);
err = fdb_vid_parse(tb[NDA_VLAN], &vid, extack);
--
2.17.2
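
The failure mode described in the commit message can be sketched in userspace. This is only an illustration under stated assumptions, not kernel code: `struct fake_dev`, `cmp_len()` and `fdb_add_allowed()` are hypothetical names, the `ARPHRD_*` and `ETH_ALEN` values are copied from the UAPI headers, and a 16-byte hardware address for an ip6tnl device is assumed.

```c
#include <errno.h>
#include <stddef.h>

#define ETH_ALEN        6    /* length of an Ethernet address */
#define ARPHRD_ETHER    1    /* Ethernet device type */
#define ARPHRD_TUNNEL6  769  /* IPv6 tunnel (ip6tnl) device type */

/* Hypothetical stand-in for the two net_device fields that matter here. */
struct fake_dev {
	unsigned short type;
	unsigned char addr_len;
};

/* Bytes a comparison keyed on dev->addr_len would read from the caller's buffer. */
static size_t cmp_len(const struct fake_dev *dev)
{
	return dev->addr_len;
}

/* Same order of checks as after the patch: address length first, then device type. */
static int fdb_add_allowed(const struct fake_dev *dev, size_t lladdr_len)
{
	if (lladdr_len != ETH_ALEN)
		return -EINVAL;
	if (dev->type != ARPHRD_ETHER)
		return -EINVAL;
	return 0;
}
```

With a device of type `ARPHRD_TUNNEL6` and `addr_len` 16, `cmp_len()` exceeds the 6 bytes that were validated, which is the over-read KMSAN flags; `fdb_add_allowed()` rejects that device before any comparison takes place.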
* Re: CAKE and r8169 cause panic on upload in v4.19
From: David Miller @ 2018-10-28 4:44 UTC
To: oleksandr
Cc: dave.taht, hkallweit1, toke, jhs, xiyou.wangcong, jiri, netdev,
linux-kernel
In-Reply-To: <b80e6819da8ea74f18b6ec0aaf9128fa@natalenko.name>
From: Oleksandr Natalenko <oleksandr@natalenko.name>
Date: Fri, 26 Oct 2018 22:54:12 +0200
> Next, I've seen GRO bits in the call trace and decided to disable GRO
> on this NIC. So far, I cannot trigger a panic with GRO disabled even
> after 20 rounds of speedtest.
>
> So, must be some generic thing indeed.
Yeah something is out-of-whack with GRO.
Does this fix it?
diff --git a/net/core/dev.c b/net/core/dev.c
index 022ad73d6253..77d43ae2a7bb 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5457,7 +5457,7 @@ static void gro_flush_oldest(struct list_head *head)
/* Do not adjust napi->gro_hash[].count, caller is adding a new
* SKB to the chain.
*/
- list_del(&oldest->list);
+ skb_list_del_init(oldest);
napi_gro_complete(oldest);
}
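
One plausible reading of the one-line change above, sketched in userspace. The assumptions are loud ones: that list_del() leaves poison values in the removed entry, that skb_list_del_init() leaves the next pointer cleared, and that some later code treats a non-NULL next as "more buffers follow". None of this is stated in the mail, and every name below is hypothetical.

```c
#include <stddef.h>

/* Hypothetical miniature of a doubly linked list node; not the kernel's list_head. */
struct node {
	struct node *next;
	struct node *prev;
};

/* Stand-in for a poison value: a non-NULL pointer that must never be dereferenced. */
#define NODE_POISON ((struct node *)0x100)

/* Splice the node out of its list without touching the node's own pointers. */
static void node_unlink(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}

/* list_del()-style removal: unlink, then poison the stale pointers. */
static void node_del(struct node *n)
{
	node_unlink(n);
	n->next = NODE_POISON;
	n->prev = NODE_POISON;
}

/* Removal that leaves next == NULL, so "is something chained?" tests read false. */
static void node_del_clear_next(struct node *n)
{
	node_unlink(n);
	n->next = NULL;
}

/* Later code that treats a non-NULL next as "more nodes follow". */
static int has_follower(const struct node *n)
{
	return n->next != NULL;
}
```

After `node_del()` the removed node still looks chained to a consumer like `has_follower()`, while after `node_del_clear_next()` it does not.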
* Re: [PATCH v2] sctp: socket.c validate sprstat_policy
From: Xin Long @ 2018-10-28 4:17 UTC
To: tomasbortoli
Cc: Vlad Yasevich, Neil Horman, Marcelo Ricardo Leitner, davem,
linux-sctp, network dev, LKML
In-Reply-To: <20181027205320.14975-1-tomasbortoli@gmail.com>
On Sun, Oct 28, 2018 at 5:54 AM Tomas Bortoli <tomasbortoli@gmail.com> wrote:
>
> It is possible to perform out-of-bound reads on
> sctp_getsockopt_pr_streamstatus() and on
> sctp_getsockopt_pr_assocstatus() by passing from userspace a
> sprstat_policy that overflows the abandoned_sent/abandoned_unsent
> fixed length arrays. The over-read data are directly copied/leaked
> to userspace.
>
> Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com>
> Reported-by: syzbot+5da0d0a72a9e7d791748@syzkaller.appspotmail.com
> ---
> net/sctp/socket.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index fc0386e8ff23..14dce5d95817 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -7083,7 +7083,9 @@ static int sctp_getsockopt_pr_assocstatus(struct sock *sk, int len,
> }
>
> policy = params.sprstat_policy;
> - if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)))
> + if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)) ||
> + __SCTP_PR_INDEX(policy) > SCTP_PR_INDEX(MAX) ||
> + __SCTP_PR_INDEX(policy) < 0)
> goto out;
>
> asoc = sctp_id2assoc(sk, params.sprstat_assoc_id);
> @@ -7142,7 +7144,9 @@ static int sctp_getsockopt_pr_streamstatus(struct sock *sk, int len,
> }
>
> policy = params.sprstat_policy;
> - if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)))
> + if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)) ||
> + __SCTP_PR_INDEX(policy) > SCTP_PR_INDEX(MAX) ||
> + __SCTP_PR_INDEX(policy) < 0)
> goto out;
This is not the correct fix.
See https://lkml.org/lkml/2018/10/27/136
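
The class of bug described in the quoted commit message can be sketched as follows. This takes no position on which fix is the right one. The constants are hypothetical values picked for illustration and are not claimed to match the SCTP UAPI; `mask_ok()` and `index_in_bounds()` are made-up names.

```c
#include <stddef.h>

/* Hypothetical values for illustration; not claimed to match the SCTP UAPI. */
#define POLICY_MASK     0x0030u      /* bits that select one policy */
#define POLICY_ALL      0x80000000u  /* flag meaning "all policies" */
#define POLICY_INDEX(p) (((p) >> 4) - 1u)
#define NUM_POLICIES    3u           /* length of the per-policy counter array */

/* Mask-only validation: rejects zero and unknown bits, like the pre-patch test. */
static int mask_ok(unsigned int policy)
{
	return policy != 0 && (policy & ~(POLICY_MASK | POLICY_ALL)) == 0;
}

/* True only if POLICY_INDEX(policy) is a valid subscript for the counter array. */
static int index_in_bounds(unsigned int policy)
{
	return POLICY_INDEX(policy) < NUM_POLICIES;
}
```

A value such as `POLICY_ALL | 0x0010` passes `mask_ok()` yet fails `index_in_bounds()`, so a mask test alone does not bound the array subscript derived from the same value.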
* 4.19 - tons of hw csum failure errors
From: Nikola Ciprich @ 2018-10-27 19:18 UTC
To: netdev; +Cc: nik
Hi,
just wanted to report that after switching to 4.19 (from 4.14.x, so maybe
the problem appeared somewhere in between), I'm getting tons of similar
messages:
Oct 27 09:06:27 xxx kernel: br501: hw csum failure
Oct 27 09:06:27 xxx kernel: CPU: 8 PID: 0 Comm: swapper/8 Tainted: G E 4.19.0lb7.00_01_PRE04 #1
Oct 27 09:06:27 xxx kernel: Hardware name: Supermicro Super Server/X11DDW-NT, BIOS 2.0b 03/07/2018
Oct 27 09:06:27 xxx kernel: Call Trace:
Oct 27 09:06:27 xxx kernel: <IRQ>
Oct 27 09:06:27 xxx kernel: dump_stack+0x5a/0x73
Oct 27 09:06:27 xxx kernel: __skb_checksum_complete+0xba/0xc0
Oct 27 09:06:27 xxx kernel: tcp_error+0x108/0x180 [nf_conntrack]
Oct 27 09:06:27 xxx kernel: nf_conntrack_in+0xd2/0x4b0 [nf_conntrack]
Oct 27 09:06:27 xxx kernel: ? csum_partial+0xd/0x20
Oct 27 09:06:27 xxx kernel: nf_hook_slow+0x3d/0xb0
Oct 27 09:06:27 xxx kernel: ip_rcv+0xb5/0xd0
Oct 27 09:06:27 xxx kernel: ? ip_rcv_finish_core.isra.12+0x370/0x370
Oct 27 09:06:27 xxx kernel: __netif_receive_skb_one_core+0x52/0x70
Oct 27 09:06:27 xxx kernel: process_backlog+0xa3/0x150
Oct 27 09:06:27 xxx kernel: net_rx_action+0x2af/0x3f0
Oct 27 09:06:27 xxx kernel: __do_softirq+0xd1/0x28c
Oct 27 09:06:27 xxx kernel: irq_exit+0xde/0xf0
Oct 27 09:06:27 xxx kernel: do_IRQ+0x54/0xe0
Oct 27 09:06:27 xxx kernel: common_interrupt+0xf/0xf
Oct 27 09:06:27 xxx kernel: </IRQ>
Oct 27 09:06:27 xxx kernel: RIP: 0010:cpuidle_enter_state+0xb6/0x2e0
Oct 27 09:06:27 xxx kernel: Code: 7e e8 ee 84 b2 ff 8b 5d 04 49 89 c6 0f 1f 44 00 00 31 ff e8 bc 95 b2 ff 80 7c 24 03 00 0f 85 93 01 00 00 fb 66 0f 1f 44 00 00 <4d> 29 fe 48 ba cf f7 53 e3 a5 9b c4 20 4c 89 f0 49 c1 fe 3f 48 f7
Oct 27 09:06:27 xxx kernel: RSP: 0018:ffffc90018b17e88 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdb
Oct 27 09:06:27 xxx kernel: RAX: ffff888faf822600 RBX: 0000000000000008 RCX: 000000000000001f
Oct 27 09:06:27 xxx kernel: RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000000
Oct 27 09:06:27 xxx kernel: RBP: ffffe8ffffa029a8 R08: 0000000000000002 R09: ffe9afdaa39e3efa
Oct 27 09:06:27 xxx kernel: R10: 0000000000000377 R11: 0000000000000008 R12: 0000000000000008
Oct 27 09:06:27 xxx kernel: R13: 0000000000000003 R14: 0000000d6670b7ca R15: 0000000d564db458
Oct 27 09:06:27 xxx kernel: ? cpuidle_enter_state+0xa4/0x2e0
Oct 27 09:06:27 xxx kernel: do_idle+0x1e4/0x290
Oct 27 09:06:27 xxx kernel: cpu_startup_entry+0x6f/0x80
Oct 27 09:06:27 xxx kernel: start_secondary+0x1aa/0x200
Oct 27 09:06:27 xxx kernel: secondary_startup_64+0xa4/0xb0
it's being reported for various kernel threads (swapper, ksoftirqd, ...)
I tried applying
commit db4f1be3ca9b0ef7330763d07bf4ace83ad6f913
Author: Sean Tranchetti <stranche@codeaurora.org>
Date: Tue Oct 23 16:04:31 2018 -0600
net: udp: fix handling of CHECKSUM_COMPLETE packets
but to no avail..
the system is running virtual machines and using openvswitch with
following simple topology:
[root@xxx tmp]# ovs-vsctl show
22519243-4f9e-47dc-ac8c-3635f6595c4d
Bridge brovs
Port brovs
Interface brovs
type: internal
Port "bond0"
Interface "eth2"
Interface "eth3"
Port "vnet0"
tag: 502
Interface "vnet0"
Port brdef
tag: 0
Interface brdef
type: internal
Port "br51"
tag: 51
Interface "br51"
type: internal
Port "br50"
tag: 50
Interface "br50"
type: internal
Port "br501"
tag: 501
Interface "br501"
type: internal
ovs_version: "2.5.0"
is this some known problem? may I provide some additional info?
BR
nik
--
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 00 Ostrava
tel.: +420 591 166 214
fax: +420 596 621 273
mobil: +420 777 093 799
www.linuxbox.cz
mobil servis: +420 737 238 656
email servis: servis@linuxbox.cz
-------------------------------------
* Re: [Patch net 09/11] net: hns3: bugfix for handling mailbox while the command queue reinitialized
From: Sergei Shtylyov @ 2018-10-27 19:05 UTC
To: Huazhong Tan, davem
Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321
In-Reply-To: <1540608118-27449-10-git-send-email-tanhuazhong@huawei.com>
On 27.10.2018 5:41, Huazhong Tan wrote:
> In a multi-core machine, the mailbox service and reset service
> will be executed at the same time. The reset server will re-initialize
> the commond queue, before that, the mailbox handler can only get some
Command?
> invalid messages.
>
> The HCLGE_STATE_CMD_DISABLE flag means that the command queue is not
> available and needs to be reinitialized. Therefore, when the mailbox
> hanlder recognizes this flag, it should not process the command.
Handler.
>
> Fixes: dde1a86e93ca ("net: hns3: Add mailbox support to PF driver")
> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
> ---
> drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c
> index 04462a3..6ac2fab 100644
> --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c
> +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c
> @@ -400,6 +400,12 @@ void hclge_mbx_handler(struct hclge_dev *hdev)
>
> /* handle all the mailbox requests in the queue */
> while (!hclge_cmd_crq_empty(&hdev->hw)) {
> + if (test_bit(HCLGE_STATE_CMD_DISABLE, &hdev->state)) {
> + dev_warn(&hdev->pdev->dev,
> + "command queue need re-initialize\n");
Needs re-initializing.
[...]
MBR, Sergei
* Re: [Patch net 05/11] net: hns3: remove unnecessary queue reset in the hns3_uninit_all_ring()
From: Sergei Shtylyov @ 2018-10-27 19:02 UTC
To: Huazhong Tan, davem
Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321
In-Reply-To: <1540608118-27449-6-git-send-email-tanhuazhong@huawei.com>
Hello!
On 27.10.2018 5:41, Huazhong Tan wrote:
> It is not necessary to reset the queue in the hns3_uninit_all_ring(),
> since the queue is stopped in the down operation, and will be resetted
s/resetted/reset/.
> in the up operaton. And the judgment of the HCLGE_STATE_RST_HANDLING
> flag in the hclge_reset_tqp() is not correct, because we need to reset
> tqp during pf reset, otherwise it may cause queue not be resetted to
Same here.
> working state problem.
>
> Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
[...]
MBR, Sergei
* WARNING in __debug_object_init (3)
From: syzbot @ 2018-10-28 3:18 UTC
To: ast, daniel, davem, linux-kernel, netdev, syzkaller-bugs
Hello,
syzbot found the following crash on:
HEAD commit: 8c60c36d0b8c Add linux-next specific files for 20181019
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=100feec5400000
kernel config: https://syzkaller.appspot.com/x/.config?x=8b6d7c4c81535e89
dashboard link: https://syzkaller.appspot.com/bug?extid=6e682caa546b7c96c859
compiler: gcc (GCC) 8.0.1 20180413 (experimental)
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=13579abd400000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=13654f6b400000
IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+6e682caa546b7c96c859@syzkaller.appspotmail.com
ODEBUG: object 0000000015e9012c is on stack 00000000115bcb67, but NOT
annotated.
WARNING: CPU: 0 PID: 5594 at lib/debugobjects.c:369
debug_object_is_on_stack lib/debugobjects.c:363 [inline]
WARNING: CPU: 0 PID: 5594 at lib/debugobjects.c:369
__debug_object_init.cold.14+0x51/0xdf lib/debugobjects.c:395
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 5594 Comm: syz-executor740 Not tainted
4.19.0-rc8-next-20181019+ #98
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x244/0x39d lib/dump_stack.c:113
panic+0x2ad/0x55c kernel/panic.c:188
__warn.cold.8+0x20/0x45 kernel/panic.c:540
report_bug+0x254/0x2d0 lib/bug.c:186
fixup_bug arch/x86/kernel/traps.c:178 [inline]
do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290
invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:969
RIP: 0010:debug_object_is_on_stack lib/debugobjects.c:363 [inline]
RIP: 0010:__debug_object_init.cold.14+0x51/0xdf lib/debugobjects.c:395
Code: ea 03 80 3c 02 00 75 7c 49 8b 54 24 18 48 89 de 48 c7 c7 c0 f1 40 88
4c 89 85 d0 fd ff ff e8 09 8c d1 fd 4c 8b 85 d0 fd ff ff <0f> 0b e9 09 d6
ff ff 41 83 c4 01 b8 ff ff 37 00 44 89 25 b7 4e 66
RSP: 0018:ffff8801bb387308 EFLAGS: 00010086
RAX: 0000000000000050 RBX: ffff8801bb387af8 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff816585a5 RDI: 0000000000000005
RBP: ffff8801bb387560 R08: ffff8801cb208a20 R09: ffffed003b5c5008
R10: ffffed003b5c5008 R11: ffff8801dae28047 R12: ffff8801d82ea300
R13: 0000000000069700 R14: ffff8801d82ea300 R15: ffff8801cb208a10
debug_object_init+0x16/0x20 lib/debugobjects.c:432
debug_timer_init kernel/time/timer.c:704 [inline]
debug_init kernel/time/timer.c:757 [inline]
init_timer_key+0xa9/0x480 kernel/time/timer.c:806
sock_init_data+0xe1/0xdc0 net/core/sock.c:2696
bpf_prog_test_run_skb+0x255/0xc40 net/bpf/test_run.c:144
bpf_prog_test_run+0x130/0x1a0 kernel/bpf/syscall.c:1790
__do_sys_bpf kernel/bpf/syscall.c:2427 [inline]
__se_sys_bpf kernel/bpf/syscall.c:2371 [inline]
__x64_sys_bpf+0x3d8/0x510 kernel/bpf/syscall.c:2371
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x440259
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffc212cf818 EFLAGS: 00000213 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440259
RDX: 0000000000000028 RSI: 0000000020000080 RDI: 000000000000000a
RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8
R10: 0000000000000000 R11: 0000000000000213 R12: 0000000000401ae0
R13: 0000000000401b70 R14: 0000000000000000 R15: 0000000000000000
======================================================
WARNING: possible circular locking dependency detected
4.19.0-rc8-next-20181019+ #98 Not tainted
------------------------------------------------------
syz-executor740/5594 is trying to acquire lock:
00000000688fcc6b ((console_sem).lock){-.-.}, at: down_trylock+0x13/0x70
kernel/locking/semaphore.c:136
but task is already holding lock:
00000000505ead1b (&obj_hash[i].lock){-.-.}, at:
__debug_object_init+0x127/0x1290 lib/debugobjects.c:384
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #3 (&obj_hash[i].lock){-.-.}:
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x99/0xd0 kernel/locking/spinlock.c:152
__debug_object_init+0x127/0x1290 lib/debugobjects.c:384
debug_object_init+0x16/0x20 lib/debugobjects.c:432
debug_hrtimer_init kernel/time/hrtimer.c:410 [inline]
debug_init kernel/time/hrtimer.c:458 [inline]
hrtimer_init+0x97/0x490 kernel/time/hrtimer.c:1308
init_dl_task_timer+0x1b/0x50 kernel/sched/deadline.c:1057
__sched_fork+0x2ae/0x590 kernel/sched/core.c:2166
init_idle+0x75/0x740 kernel/sched/core.c:5382
sched_init+0xb33/0xc02 kernel/sched/core.c:6065
start_kernel+0x4be/0xa2b init/main.c:608
x86_64_start_reservations+0x2e/0x30 arch/x86/kernel/head64.c:472
x86_64_start_kernel+0x76/0x79 arch/x86/kernel/head64.c:451
secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243
-> #2 (&rq->lock){-.-.}:
__raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
_raw_spin_lock+0x2d/0x40 kernel/locking/spinlock.c:144
rq_lock kernel/sched/sched.h:1127 [inline]
task_fork_fair+0xb0/0x6d0 kernel/sched/fair.c:9768
sched_fork+0x443/0xba0 kernel/sched/core.c:2359
copy_process+0x2585/0x8770 kernel/fork.c:1887
_do_fork+0x1cb/0x11c0 kernel/fork.c:2216
kernel_thread+0x34/0x40 kernel/fork.c:2275
rest_init+0x28/0x372 init/main.c:409
arch_call_rest_init+0xe/0x1b
start_kernel+0x9f0/0xa2b init/main.c:745
x86_64_start_reservations+0x2e/0x30 arch/x86/kernel/head64.c:472
x86_64_start_kernel+0x76/0x79 arch/x86/kernel/head64.c:451
secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243
-> #1 (&p->pi_lock){-.-.}:
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x99/0xd0 kernel/locking/spinlock.c:152
try_to_wake_up+0xd2/0x12e0 kernel/sched/core.c:1965
wake_up_process+0x10/0x20 kernel/sched/core.c:2129
__up.isra.1+0x1c0/0x2a0 kernel/locking/semaphore.c:262
up+0x13c/0x1c0 kernel/locking/semaphore.c:187
__up_console_sem+0xbe/0x1b0 kernel/printk/printk.c:236
console_unlock+0x80c/0x1190 kernel/printk/printk.c:2432
vprintk_emit+0x391/0x990 kernel/printk/printk.c:1922
vprintk_default+0x28/0x30 kernel/printk/printk.c:1964
vprintk_func+0x7e/0x181 kernel/printk/printk_safe.c:398
printk+0xa7/0xcf kernel/printk/printk.c:1997
check_stack_usage kernel/exit.c:755 [inline]
do_exit.cold.18+0x57/0x16f kernel/exit.c:916
do_group_exit+0x177/0x440 kernel/exit.c:970
__do_sys_exit_group kernel/exit.c:981 [inline]
__se_sys_exit_group kernel/exit.c:979 [inline]
__x64_sys_exit_group+0x3e/0x50 kernel/exit.c:979
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
-> #0 ((console_sem).lock){-.-.}:
lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3844
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x99/0xd0 kernel/locking/spinlock.c:152
down_trylock+0x13/0x70 kernel/locking/semaphore.c:136
__down_trylock_console_sem+0xae/0x1f0 kernel/printk/printk.c:219
console_trylock+0x15/0xa0 kernel/printk/printk.c:2247
console_trylock_spinning kernel/printk/printk.c:1653 [inline]
vprintk_emit+0x372/0x990 kernel/printk/printk.c:1921
vprintk_default+0x28/0x30 kernel/printk/printk.c:1964
vprintk_func+0x7e/0x181 kernel/printk/printk_safe.c:398
printk+0xa7/0xcf kernel/printk/printk.c:1997
debug_object_is_on_stack lib/debugobjects.c:363 [inline]
__debug_object_init.cold.14+0x4a/0xdf lib/debugobjects.c:395
debug_object_init+0x16/0x20 lib/debugobjects.c:432
debug_timer_init kernel/time/timer.c:704 [inline]
debug_init kernel/time/timer.c:757 [inline]
init_timer_key+0xa9/0x480 kernel/time/timer.c:806
sock_init_data+0xe1/0xdc0 net/core/sock.c:2696
bpf_prog_test_run_skb+0x255/0xc40 net/bpf/test_run.c:144
bpf_prog_test_run+0x130/0x1a0 kernel/bpf/syscall.c:1790
__do_sys_bpf kernel/bpf/syscall.c:2427 [inline]
__se_sys_bpf kernel/bpf/syscall.c:2371 [inline]
__x64_sys_bpf+0x3d8/0x510 kernel/bpf/syscall.c:2371
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
other info that might help us debug this:
Chain exists of:
(console_sem).lock --> &rq->lock --> &obj_hash[i].lock
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&obj_hash[i].lock);
lock(&rq->lock);
lock(&obj_hash[i].lock);
lock((console_sem).lock);
*** DEADLOCK ***
1 lock held by syz-executor740/5594:
#0: 00000000505ead1b (&obj_hash[i].lock){-.-.}, at:
__debug_object_init+0x127/0x1290 lib/debugobjects.c:384
stack backtrace:
CPU: 0 PID: 5594 Comm: syz-executor740 Not tainted
4.19.0-rc8-next-20181019+ #98
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x244/0x39d lib/dump_stack.c:113
print_circular_bug.isra.35.cold.54+0x1bd/0x27d
kernel/locking/lockdep.c:1221
check_prev_add kernel/locking/lockdep.c:1863 [inline]
check_prevs_add kernel/locking/lockdep.c:1976 [inline]
validate_chain kernel/locking/lockdep.c:2347 [inline]
__lock_acquire+0x3399/0x4c20 kernel/locking/lockdep.c:3341
lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3844
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x99/0xd0 kernel/locking/spinlock.c:152
down_trylock+0x13/0x70 kernel/locking/semaphore.c:136
__down_trylock_console_sem+0xae/0x1f0 kernel/printk/printk.c:219
console_trylock+0x15/0xa0 kernel/printk/printk.c:2247
console_trylock_spinning kernel/printk/printk.c:1653 [inline]
vprintk_emit+0x372/0x990 kernel/printk/printk.c:1921
vprintk_default+0x28/0x30 kernel/printk/printk.c:1964
vprintk_func+0x7e/0x181 kernel/printk/printk_safe.c:398
printk+0xa7/0xcf kernel/printk/printk.c:1997
debug_object_is_on_stack lib/debugobjects.c:363 [inline]
__debug_object_init.cold.14+0x4a/0xdf lib/debugobjects.c:395
debug_object_init+0x16/0x20 lib/debugobjects.c:432
debug_timer_init kernel/time/timer.c:704 [inline]
debug_init kernel/time/timer.c:757 [inline]
init_timer_key+0xa9/0x480 kernel/time/timer.c:806
sock_init_data+0xe1/0xdc0 net/core/sock.c:2696
bpf_prog_test_run_skb+0x255/0xc40 net/bpf/test_run.c:144
bpf_prog_test_run+0x130/0x1a0 kernel/bpf/syscall.c:1790
__do_sys_bpf kernel/bpf/syscall.c:2427 [inline]
__se_sys_bpf kernel/bpf/syscall.c:2371 [inline]
__x64_sys_bpf+0x3d8/0x510 kernel/bpf/syscall.c:2371
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x440259
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffc212cf818 EFLAGS: 00000213 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440259
RDX: 0000000000000028 RSI: 0000000020000080 RDI: 000000000000000a
RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8
R10: 0000000000000000 R11: 0000000000000213 R12: 0000000000401ae0
R13: 0000000000401b70 R14: 0000000000000000 R15: 0000000000000000
Kernel Offset: disabled
Rebooting in 86400 seconds..
---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches
* [PATCH net v4] net/ipv6: Add anycast addresses to a global hashtable
From: Jeff Barnhill @ 2018-10-27 18:02 UTC
To: netdev; +Cc: davem, kuznet, yoshfuji, Jeff Barnhill
In-Reply-To: <95cb5670-eaf0-c7af-7e35-bc4f6e68c5ba@gmail.com>
icmp6_send() function is expensive on systems with a large number of
interfaces. Every time it’s called, it has to verify that the source
address does not correspond to an existing anycast address by looping
through every device and every anycast address on the device. This can
result in significant delays for a CPU when there are a large number of
neighbors and ND timers are frequently timing out and calling
neigh_invalidate().
Add anycast addresses to a global hashtable to allow quick searching for
matching anycast addresses. This is based on inet6_addr_lst in addrconf.c.
Signed-off-by: Jeff Barnhill <0xeffeff@gmail.com>
---
include/net/addrconf.h | 2 +
include/net/if_inet6.h | 8 ++++
net/ipv6/af_inet6.c | 5 +++
net/ipv6/anycast.c | 120 ++++++++++++++++++++++++++++++++++++++++++++++++-
4 files changed, 133 insertions(+), 2 deletions(-)
diff --git a/include/net/addrconf.h b/include/net/addrconf.h
index 14b789a123e7..799af1a037d1 100644
--- a/include/net/addrconf.h
+++ b/include/net/addrconf.h
@@ -317,6 +317,8 @@ bool ipv6_chk_acast_addr(struct net *net, struct net_device *dev,
const struct in6_addr *addr);
bool ipv6_chk_acast_addr_src(struct net *net, struct net_device *dev,
const struct in6_addr *addr);
+int anycast_init(void);
+void anycast_cleanup(void);
/* Device notifier */
int register_inet6addr_notifier(struct notifier_block *nb);
diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h
index d7578cf49c3a..a445014b981d 100644
--- a/include/net/if_inet6.h
+++ b/include/net/if_inet6.h
@@ -142,6 +142,14 @@ struct ipv6_ac_socklist {
struct ipv6_ac_socklist *acl_next;
};
+struct ipv6_ac_addrlist {
+ struct in6_addr acal_addr;
+ possible_net_t acal_pnet;
+ refcount_t acal_users;
+ struct hlist_node acal_lst; /* inet6_acaddr_lst */
+ struct rcu_head rcu;
+};
+
struct ifacaddr6 {
struct in6_addr aca_addr;
struct fib6_info *aca_rt;
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 3f4d61017a69..ddc8a6dbfba2 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -1001,6 +1001,9 @@ static int __init inet6_init(void)
err = ip6_flowlabel_init();
if (err)
goto ip6_flowlabel_fail;
+ err = anycast_init();
+ if (err)
+ goto anycast_fail;
err = addrconf_init();
if (err)
goto addrconf_fail;
@@ -1091,6 +1094,8 @@ static int __init inet6_init(void)
ipv6_exthdrs_fail:
addrconf_cleanup();
addrconf_fail:
+ anycast_cleanup();
+anycast_fail:
ip6_flowlabel_cleanup();
ip6_flowlabel_fail:
ndisc_late_cleanup();
diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
index 4e0ff7031edd..45585010908a 100644
--- a/net/ipv6/anycast.c
+++ b/net/ipv6/anycast.c
@@ -44,8 +44,22 @@
#include <net/checksum.h>
+#define IN6_ADDR_HSIZE_SHIFT 8
+#define IN6_ADDR_HSIZE BIT(IN6_ADDR_HSIZE_SHIFT)
+/* anycast address hash table
+ */
+static struct hlist_head inet6_acaddr_lst[IN6_ADDR_HSIZE];
+static DEFINE_SPINLOCK(acaddr_hash_lock);
+
static int ipv6_dev_ac_dec(struct net_device *dev, const struct in6_addr *addr);
+static u32 inet6_acaddr_hash(struct net *net, const struct in6_addr *addr)
+{
+ u32 val = ipv6_addr_hash(addr) ^ net_hash_mix(net);
+
+ return hash_32(val, IN6_ADDR_HSIZE_SHIFT);
+}
+
/*
* socket join an anycast group
*/
@@ -204,6 +218,73 @@ void ipv6_sock_ac_close(struct sock *sk)
rtnl_unlock();
}
+static struct ipv6_ac_addrlist *acal_alloc(struct net *net,
+ const struct in6_addr *addr)
+{
+ struct ipv6_ac_addrlist *acal;
+
+ acal = kzalloc(sizeof(*acal), GFP_ATOMIC);
+ if (!acal)
+ return NULL;
+
+ acal->acal_addr = *addr;
+ write_pnet(&acal->acal_pnet, net);
+ refcount_set(&acal->acal_users, 1);
+ INIT_HLIST_NODE(&acal->acal_lst);
+
+ return acal;
+}
+
+static int ipv6_add_acaddr_hash(struct net *net, const struct in6_addr *addr)
+{
+ unsigned int hash = inet6_acaddr_hash(net, addr);
+ struct ipv6_ac_addrlist *acal;
+ int err = 0;
+
+ spin_lock(&acaddr_hash_lock);
+ hlist_for_each_entry(acal, &inet6_acaddr_lst[hash], acal_lst) {
+ if (!net_eq(read_pnet(&acal->acal_pnet), net))
+ continue;
+ if (ipv6_addr_equal(&acal->acal_addr, addr)) {
+ refcount_inc(&acal->acal_users);
+ goto out;
+ }
+ }
+
+ acal = acal_alloc(net, addr);
+ if (!acal) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ hlist_add_head_rcu(&acal->acal_lst, &inet6_acaddr_lst[hash]);
+
+out:
+ spin_unlock(&acaddr_hash_lock);
+ return err;
+}
+
+static void ipv6_del_acaddr_hash(struct net *net, const struct in6_addr *addr)
+{
+ unsigned int hash = inet6_acaddr_hash(net, addr);
+ struct ipv6_ac_addrlist *acal;
+
+ spin_lock(&acaddr_hash_lock);
+ hlist_for_each_entry(acal, &inet6_acaddr_lst[hash], acal_lst) {
+ if (!net_eq(read_pnet(&acal->acal_pnet), net))
+ continue;
+ if (ipv6_addr_equal(&acal->acal_addr, addr)) {
+ if (refcount_dec_and_test(&acal->acal_users)) {
+ hlist_del_init_rcu(&acal->acal_lst);
+ kfree_rcu(acal, rcu);
+ }
+ spin_unlock(&acaddr_hash_lock);
+ return;
+ }
+ }
+ spin_unlock(&acaddr_hash_lock);
+}
+
static void aca_get(struct ifacaddr6 *aca)
{
refcount_inc(&aca->aca_refcnt);
@@ -275,6 +356,11 @@ int __ipv6_dev_ac_inc(struct inet6_dev *idev, const struct in6_addr *addr)
err = -ENOMEM;
goto out;
}
+ err = ipv6_add_acaddr_hash(dev_net(idev->dev), addr);
+ if (err) {
+ aca_put(aca);
+ goto out;
+ }
aca->aca_next = idev->ac_list;
idev->ac_list = aca;
@@ -324,6 +410,7 @@ int __ipv6_dev_ac_dec(struct inet6_dev *idev, const struct in6_addr *addr)
prev_aca->aca_next = aca->aca_next;
else
idev->ac_list = aca->aca_next;
+ ipv6_del_acaddr_hash(dev_net(idev->dev), &aca->aca_addr);
write_unlock_bh(&idev->lock);
addrconf_leave_solict(idev, &aca->aca_addr);
@@ -350,6 +437,8 @@ void ipv6_ac_destroy_dev(struct inet6_dev *idev)
write_lock_bh(&idev->lock);
while ((aca = idev->ac_list) != NULL) {
idev->ac_list = aca->aca_next;
+ ipv6_del_acaddr_hash(dev_net(idev->dev), &aca->aca_addr);
+
write_unlock_bh(&idev->lock);
addrconf_leave_solict(idev, &aca->aca_addr);
@@ -390,17 +479,23 @@ static bool ipv6_chk_acast_dev(struct net_device *dev, const struct in6_addr *ad
bool ipv6_chk_acast_addr(struct net *net, struct net_device *dev,
const struct in6_addr *addr)
{
+ unsigned int hash = inet6_acaddr_hash(net, addr);
+ struct ipv6_ac_addrlist *acal;
bool found = false;
rcu_read_lock();
if (dev)
found = ipv6_chk_acast_dev(dev, addr);
else
- for_each_netdev_rcu(net, dev)
- if (ipv6_chk_acast_dev(dev, addr)) {
+ hlist_for_each_entry_rcu(acal, &inet6_acaddr_lst[hash],
+ acal_lst) {
+ if (!net_eq(read_pnet(&acal->acal_pnet), net))
+ continue;
+ if (ipv6_addr_equal(&acal->acal_addr, addr)) {
found = true;
break;
}
+ }
rcu_read_unlock();
return found;
}
@@ -539,4 +634,25 @@ void ac6_proc_exit(struct net *net)
{
remove_proc_entry("anycast6", net->proc_net);
}
+
+/* Init / cleanup code
+ */
+int __init anycast_init(void)
+{
+	int i;
+
+	for (i = 0; i < IN6_ADDR_HSIZE; i++)
+		INIT_HLIST_HEAD(&inet6_acaddr_lst[i]);
+	return 0;
+}
+
+void anycast_cleanup(void)
+{
+	int i;
+
+	spin_lock(&acaddr_hash_lock);
+	for (i = 0; i < IN6_ADDR_HSIZE; i++)
+		WARN_ON(!hlist_empty(&inet6_acaddr_lst[i]));
+	spin_unlock(&acaddr_hash_lock);
+}
#endif
--
2.14.1
^ permalink raw reply related
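
For readers skimming the patch above: it tracks anycast addresses in a global hash table keyed by (namespace, address), with one user count per entry, so the lookup no longer walks every device. The following is only a minimal userspace sketch of that bookkeeping, not the kernel code. It leaves out the spinlock and RCU entirely, and the names (ac_add, ac_del, ac_contains, AC_HSIZE), the integer namespace id, the hash function, and the assumption that a new entry starts with a count of 1 are all illustrative.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define AC_HSIZE 16	/* hypothetical bucket count */

struct ac_entry {
	struct ac_entry *next;
	int netns;		/* stand-in for the network namespace */
	uint8_t addr[16];	/* IPv6 address bytes */
	unsigned int users;	/* how many devices joined this address */
};

static struct ac_entry *ac_table[AC_HSIZE];

/* Toy hash over namespace id and address; the kernel uses its own. */
static unsigned int ac_hash(int netns, const uint8_t addr[16])
{
	unsigned int h = (unsigned int)netns;
	int i;

	for (i = 0; i < 16; i++)
		h = h * 31 + addr[i];
	return h % AC_HSIZE;
}

static struct ac_entry *ac_find(int netns, const uint8_t addr[16])
{
	struct ac_entry *e;

	for (e = ac_table[ac_hash(netns, addr)]; e; e = e->next)
		if (e->netns == netns && !memcmp(e->addr, addr, 16))
			return e;
	return NULL;
}

/* Take a reference on an existing entry, or insert a new one. */
static int ac_add(int netns, const uint8_t addr[16])
{
	struct ac_entry *e = ac_find(netns, addr);
	unsigned int h;

	if (e) {
		e->users++;
		return 0;
	}
	e = calloc(1, sizeof(*e));
	if (!e)
		return -1;
	e->netns = netns;
	memcpy(e->addr, addr, 16);
	e->users = 1;
	h = ac_hash(netns, addr);
	e->next = ac_table[h];
	ac_table[h] = e;
	return 0;
}

/* Drop one reference; unlink and free the entry when it was the last. */
static void ac_del(int netns, const uint8_t addr[16])
{
	struct ac_entry **pp = &ac_table[ac_hash(netns, addr)];

	for (; *pp; pp = &(*pp)->next) {
		struct ac_entry *e = *pp;

		if (e->netns != netns || memcmp(e->addr, addr, 16))
			continue;
		if (--e->users == 0) {
			*pp = e->next;
			free(e);
		}
		return;
	}
}

/* Single-bucket lookup, analogous to the dev == NULL path. */
static int ac_contains(int netns, const uint8_t addr[16])
{
	return ac_find(netns, addr) != NULL;
}
```

With this shape, an address joined on two devices in the same namespace stays visible until both have left, and the same address in a different namespace never matches.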
* RE: [LKP] [tcp] a337531b94: netperf.Throughput_Mbps -6.1% regression
From: Wang, Kemi @ 2018-10-28 1:43 UTC (permalink / raw)
To: Eric Dumazet, Chen, Rong A, Yuchung Cheng
Cc: Soheil Hassas Yeganeh, netdev@vger.kernel.org, LKML, Eric Dumazet,
lkp@01.org, Wei Wang, Neal Cardwell, David S. Miller
In-Reply-To: <e22a09c4-8bbb-e482-6e1e-59ea1111eda3@gmail.com>
Hi, Eric
Thanks for the info.
We reran the test and verified that this issue is fixed by commit 041a14d2671573611ffd6412bc16e2f64469f7fb.
Only about a 0.1% performance difference was observed.
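
For reference, the %change column in the quoted report appears to be the plain relative difference between the two per-commit means. A minimal sketch, assuming that definition (the helper name percent_change is made up here):

```c
#include <assert.h>

/* Relative change of the patched mean against the base mean, in percent. */
static double percent_change(double base, double patched)
{
	return (patched - base) / base * 100.0;
}
```

Applied to the TCP_STREAM means in the report (2497 and 2345 Mbps) this gives about -6.09, which rounds to the reported -6.1%.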
-----Original Message-----
From: LKP [mailto:lkp-bounces@lists.01.org] On Behalf Of Eric Dumazet
Sent: Wednesday, October 24, 2018 9:27 PM
To: Chen, Rong A <rong.a.chen@intel.com>; Yuchung Cheng <ycheng@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>; netdev@vger.kernel.org; LKML <linux-kernel@vger.kernel.org>; Eric Dumazet <edumazet@google.com>; lkp@01.org; Wei Wang <weiwan@google.com>; Neal Cardwell <ncardwell@google.com>; David S. Miller <davem@davemloft.net>
Subject: Re: [LKP] [tcp] a337531b94: netperf.Throughput_Mbps -6.1% regression
Hi Rong
This has already been reported, and we believe it has been fixed with:
commit 041a14d2671573611ffd6412bc16e2f64469f7fb
Author: Yuchung Cheng <ycheng@google.com>
Date: Mon Oct 1 15:42:32 2018 -0700
tcp: start receiver buffer autotuning sooner
Previously receiver buffer auto-tuning starts after receiving
one advertised window amount of data. After the initial receiver
buffer was raised by patch a337531b942b ("tcp: up initial rmem to
128KB and SYN rwin to around 64KB"), the receiver buffer may take
too long to start raising. To address this issue, this patch lowers
the initial bytes expected to receive roughly the expected sender's
initial window.
Fixes: a337531b942b ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
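
To make the commit message above concrete: the idea, as described, is that autotuning should kick in after roughly one sender initial window of data instead of one full advertised window. The sketch below is an illustration of that arithmetic only, not the kernel code. The function name is hypothetical, and the 10-segment initial window is an assumption about the sender.

```c
#include <assert.h>
#include <stdint.h>

#define INIT_CWND_SEGMENTS 10u	/* assumed sender initial window, in segments */

/*
 * Bytes to receive before the first receive-buffer adjustment:
 * the smaller of the advertised window and the expected sender
 * initial window.
 */
static uint32_t initial_autotune_bytes(uint32_t rcv_wnd, uint32_t advmss)
{
	uint32_t sender_iw = INIT_CWND_SEGMENTS * advmss;

	return rcv_wnd < sender_iw ? rcv_wnd : sender_iw;
}
```

With a 64KB advertised window and a 1460-byte MSS, the threshold drops from about 64KB to 14600 bytes, so the first adjustment would happen much earlier in the connection.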
Thanks
On 10/24/2018 05:13 AM, kernel test robot wrote:
> Greetings,
>
> FYI, we noticed a -6.1% regression of netperf.Throughput_Mbps due to commit:
>
>
> commit: a337531b942bd8a03e7052444d7e36972aac2d92 ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
> https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git master
>
> in testcase: netperf
> on test machine: 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 8G memory
> with following parameters:
>
> ip: ipv4
> runtime: 900s
> nr_threads: 200%
> cluster: cs-localhost
> test: TCP_STREAM
> ucode: 0x7000013
> cpufreq_governor: performance
>
> test-description: Netperf is a benchmark that can be used to measure various aspects of networking performance.
> test-url: http://www.netperf.org/netperf/
>
> In addition, the commit also has a significant impact on the following tests:
>
> +------------------+-------------------------------------------------------------------+
> | testcase: change | netperf: netperf.Throughput_Mbps -1.0% regression |
> | test machine | 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 8G memory |
> | test parameters | cluster=cs-localhost |
> | | cpufreq_governor=performance |
> | | ip=ipv4 |
> | | nr_threads=200% |
> | | runtime=300s |
> | | send_size=5K |
> | | test=TCP_SENDFILE |
> | | ucode=0x7000013 |
> +------------------+-------------------------------------------------------------------+
> | testcase: change | netperf: netperf.Throughput_Mbps -5.9% regression |
> | test machine | 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 8G memory |
> | test parameters | cluster=cs-localhost |
> | | cpufreq_governor=performance |
> | | ip=ipv4 |
> | | nr_threads=200% |
> | | runtime=900s |
> | | test=TCP_MAERTS |
> | | ucode=0x7000013 |
> +------------------+-------------------------------------------------------------------+
> | testcase: change | netperf: netperf.Throughput_Mbps -3.2% regression |
> | test machine | 4 threads Intel(R) Core(TM) i5-3317U CPU @ 1.70GHz with 4G memory |
> | test parameters | cluster=cs-localhost |
> | | cpufreq_governor=performance |
> | | ip=ipv4 |
> | | nr_threads=200% |
> | | runtime=900s |
> | | test=TCP_MAERTS |
> | | ucode=0x20 |
> +------------------+-------------------------------------------------------------------+
>
>
> Details are as below:
> -------------------------------------------------------------------------------------------------->
>
>
> To reproduce:
>
> git clone https://github.com/intel/lkp-tests.git
> cd lkp-tests
> bin/lkp install job.yaml # job file is attached in this email
> bin/lkp run job.yaml
>
> =========================================================================================
> cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase/ucode:
> cs-localhost/gcc-7/performance/ipv4/x86_64-rhel-7.2/200%/debian-x86_64-2018-04-03.cgz/900s/lkp-bdw-de1/TCP_STREAM/netperf/0x7000013
>
> commit:
> 3ff6cde846 ("hns3: Another build fix.")
> a337531b94 ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
>
> 3ff6cde846857d45 a337531b942bd8a03e7052444d
> ---------------- --------------------------
> fail:runs %reproduction fail:runs
> | | |
> :4 50% 2:4 dmesg.WARNING:at#for_ip_interrupt_entry/0x
> %stddev %change %stddev
> \ | \
> 2497 -6.1% 2345 netperf.Throughput_Mbps
> 79924 -6.1% 75061 netperf.Throughput_total_Mbps
> 186513 +11.3% 207590 netperf.time.involuntary_context_switches
> 5.488e+08 -6.1% 5.154e+08 netperf.workload
> 1172 ± 34% -37.6% 731.75 ± 5% cpuidle.C1E.usage
> 1137 ± 34% -40.0% 682.25 ± 8% turbostat.C1E
> 2775 ± 11% +17.5% 3261 ± 9% sched_debug.cpu.nr_switches.stddev
> 0.01 ± 17% +28.2% 0.01 ± 10% sched_debug.rt_rq:/.rt_time.avg
> 0.14 ± 17% +28.2% 0.18 ± 10% sched_debug.rt_rq:/.rt_time.max
> 0.03 ± 17% +28.2% 0.04 ± 10% sched_debug.rt_rq:/.rt_time.stddev
> 66336 +0.9% 66948 proc-vmstat.nr_anon_pages
> 2.755e+08 -6.1% 2.588e+08 proc-vmstat.numa_hit
> 2.755e+08 -6.1% 2.588e+08 proc-vmstat.numa_local
> 2.197e+09 -6.1% 2.064e+09 proc-vmstat.pgalloc_normal
> 2.197e+09 -6.1% 2.064e+09 proc-vmstat.pgfree
> 5.903e+11 -7.9% 5.438e+11 perf-stat.branch-instructions
> 2.68 -0.0 2.64 perf-stat.branch-miss-rate%
> 1.582e+10 -9.2% 1.436e+10 perf-stat.branch-misses
> 6.26e+11 -4.7% 5.964e+11 perf-stat.cache-misses
> 6.26e+11 -4.7% 5.964e+11 perf-stat.cache-references
> 11.69 +8.6% 12.69 perf-stat.cpi
> 123723 +2.1% 126291 perf-stat.cpu-migrations
> 0.09 ± 2% +0.0 0.09 perf-stat.dTLB-load-miss-rate%
> 1.475e+12 -7.1% 1.37e+12 perf-stat.dTLB-loads
> 1.094e+12 -6.9% 1.018e+12 perf-stat.dTLB-stores
> 2.912e+08 ± 5% -13.0% 2.533e+08 perf-stat.iTLB-loads
> 3.019e+12 -7.9% 2.781e+12 perf-stat.instructions
> 0.09 -7.9% 0.08 perf-stat.ipc
> 5500 -1.9% 5394 perf-stat.path-length
> 0.53 ± 2% -0.2 0.38 ± 57% perf-profile.calltrace.cycles-pp.ip_output.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames
> 0.63 ± 2% -0.1 0.58 ± 4% perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
> 0.73 ± 3% +0.1 0.78 ± 2% perf-profile.calltrace.cycles-pp.tcp_clean_rtx_queue.tcp_ack.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv
> 0.96 +0.1 1.03 perf-profile.calltrace.cycles-pp.tcp_ack.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_local_deliver_finish
> 98.02 +0.1 98.13 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
> 97.88 +0.1 98.00 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.70 ± 3% -0.1 0.64 ± 4% perf-profile.children.cycles-pp.syscall_return_via_sysret
> 0.26 ± 5% -0.0 0.21 ± 6% perf-profile.children.cycles-pp._raw_spin_lock_bh
> 0.28 ± 5% -0.0 0.24 ± 6% perf-profile.children.cycles-pp.lock_sock_nested
> 0.46 ± 4% -0.0 0.43 ± 2% perf-profile.children.cycles-pp.nf_hook_slow
> 0.21 ± 8% -0.0 0.18 ± 5% perf-profile.children.cycles-pp.tcp_rcv_space_adjust
> 0.08 ± 5% -0.0 0.06 perf-profile.children.cycles-pp.entry_SYSCALL_64_stage2
> 0.08 ± 6% -0.0 0.06 ± 6% perf-profile.children.cycles-pp.ip_finish_output
> 0.17 ± 6% +0.0 0.20 ± 5% perf-profile.children.cycles-pp.tcp_event_new_data_sent
> 0.24 ± 4% +0.0 0.27 ± 2% perf-profile.children.cycles-pp.mod_timer
> 0.15 ± 2% +0.0 0.18 ± 2% perf-profile.children.cycles-pp.__might_sleep
> 0.80 ± 3% +0.0 0.84 ± 2% perf-profile.children.cycles-pp.tcp_clean_rtx_queue
> 0.30 ± 3% +0.1 0.36 ± 4% perf-profile.children.cycles-pp.__might_fault
> 1.61 ± 4% +0.1 1.69 perf-profile.children.cycles-pp.__release_sock
> 1.06 ± 2% +0.1 1.14 perf-profile.children.cycles-pp.tcp_ack
> 98.24 +0.1 98.36 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
> 98.09 +0.1 98.23 perf-profile.children.cycles-pp.do_syscall_64
> 70.28 +0.6 70.86 perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
> 1.56 -0.1 1.48 ± 3% perf-profile.self.cycles-pp.copy_page_to_iter
> 0.70 ± 3% -0.1 0.64 ± 4% perf-profile.self.cycles-pp.syscall_return_via_sysret
> 1.37 ± 2% -0.1 1.32 ± 2% perf-profile.self.cycles-pp.__free_pages_ok
> 0.55 ± 3% -0.0 0.50 ± 3% perf-profile.self.cycles-pp.__alloc_skb
> 0.44 ± 3% -0.0 0.40 ± 5% perf-profile.self.cycles-pp.tcp_recvmsg
> 0.16 ± 9% -0.0 0.14 ± 5% perf-profile.self.cycles-pp.sock_has_perm
> 0.08 ± 6% -0.0 0.06 perf-profile.self.cycles-pp.entry_SYSCALL_64_stage2
> 0.10 ± 4% +0.0 0.12 ± 6% perf-profile.self.cycles-pp.tcp_clean_rtx_queue
> 0.14 ± 6% +0.0 0.17 ± 4% perf-profile.self.cycles-pp.__might_sleep
> 69.25 +0.5 69.77 perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
>
>
>
> [plots: netperf.Throughput_Mbps, netperf.Throughput_total_Mbps and netperf.workload per bisect sample]
>
> [*] bisect-good sample
> [O] bisect-bad sample
>
> ***************************************************************************************************
> lkp-bdw-de1: 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 8G memory
> =========================================================================================
> cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/send_size/tbox_group/test/testcase/ucode:
> cs-localhost/gcc-7/performance/ipv4/x86_64-rhel-7.2/200%/debian-x86_64-2018-04-03.cgz/300s/5K/lkp-bdw-de1/TCP_SENDFILE/netperf/0x7000013
>
> commit:
> 3ff6cde846 ("hns3: Another build fix.")
> a337531b94 ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
>
> 3ff6cde846857d45 a337531b942bd8a03e7052444d
> ---------------- --------------------------
> fail:runs %reproduction fail:runs
> | | |
> 1:4 -25% :4 dmesg.WARNING:at#for_ip_interrupt_entry/0x
> %stddev %change %stddev
> \ | \
> 5211 -1.0% 5160 netperf.Throughput_Mbps
> 166777 -1.0% 165138 netperf.Throughput_total_Mbps
> 1268 -1.6% 1247 netperf.time.percent_of_cpu_this_job_got
> 3539 -1.6% 3481 netperf.time.system_time
> 282.77 -1.5% 278.54 netperf.time.user_time
> 1435875 -1.0% 1421780 netperf.time.voluntary_context_switches
> 1.222e+09 -1.0% 1.21e+09 netperf.workload
> 22728 -1.3% 22437 vmstat.system.cs
> 1218263 ± 3% -5.6% 1150027 ± 4% proc-vmstat.pgalloc_normal
> 1197588 ± 4% -6.0% 1125684 ± 4% proc-vmstat.pgfree
> 3424 ± 17% -28.2% 2456 ± 21% sched_debug.cpu.nr_load_updates.stddev
> 9.00 ± 11% -19.9% 7.21 ± 11% sched_debug.cpu.nr_uninterruptible.max
> 35344728 ± 33% -94.5% 1954598 ±144% cpuidle.C3.time
> 79217 ± 32% -95.5% 3571 ±115% cpuidle.C3.usage
> 13342584 ± 19% +253.4% 47153200 ± 34% cpuidle.C6.time
> 17886 ± 21% +185.8% 51115 ± 34% cpuidle.C6.usage
> 4295 ± 24% +108.0% 8934 ± 53% cpuidle.POLL.time
> 79180 ± 32% -95.6% 3487 ±118% turbostat.C3
> 0.73 ± 32% -0.7 0.04 ±144% turbostat.C3%
> 17693 ± 21% +187.9% 50931 ± 34% turbostat.C6
> 0.27 ± 19% +0.7 0.97 ± 34% turbostat.C6%
> 0.35 ± 30% -89.9% 0.04 ±173% turbostat.CPU%c3
> 0.08 ± 6% +693.3% 0.59 ± 38% turbostat.CPU%c6
> 2.95 +3.1% 3.04 turbostat.RAMWatt
> 1.711e+12 -1.3% 1.689e+12 perf-stat.branch-instructions
> 5.345e+10 -1.2% 5.283e+10 perf-stat.branch-misses
> 9.417e+10 +16.7% 1.099e+11 perf-stat.cache-misses
> 9.417e+10 +16.7% 1.099e+11 perf-stat.cache-references
> 6927335 -1.1% 6849494 perf-stat.context-switches
> 2.936e+12 -1.3% 2.899e+12 perf-stat.dTLB-loads
> 1.796e+12 -1.3% 1.773e+12 perf-stat.dTLB-stores
> 80.43 +3.5 83.95 perf-stat.iTLB-load-miss-rate%
> 3.809e+09 ± 4% -4.7% 3.629e+09 ± 2% perf-stat.iTLB-load-misses
> 9.248e+08 ± 3% -25.0% 6.934e+08 perf-stat.iTLB-loads
> 8.835e+12 -1.3% 8.719e+12 perf-stat.instructions
> 69.17 -1.1 68.08 perf-profile.calltrace.cycles-pp.__x64_sys_sendfile64.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 65.80 -1.0 64.79 perf-profile.calltrace.cycles-pp.do_sendfile.__x64_sys_sendfile64.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 55.88 -0.8 55.04 perf-profile.calltrace.cycles-pp.do_splice_direct.do_sendfile.__x64_sys_sendfile64.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 52.32 -0.8 51.56 perf-profile.calltrace.cycles-pp.splice_direct_to_actor.do_splice_direct.do_sendfile.__x64_sys_sendfile64.do_syscall_64
> 35.71 -0.6 35.11 perf-profile.calltrace.cycles-pp.direct_splice_actor.splice_direct_to_actor.do_splice_direct.do_sendfile.__x64_sys_sendfile64
> 34.84 -0.6 34.26 perf-profile.calltrace.cycles-pp.splice_from_pipe.direct_splice_actor.splice_direct_to_actor.do_splice_direct.do_sendfile
> 33.94 -0.5 33.41 perf-profile.calltrace.cycles-pp.__splice_from_pipe.splice_from_pipe.direct_splice_actor.splice_direct_to_actor.do_splice_direct
> 26.16 -0.5 25.70 perf-profile.calltrace.cycles-pp.tcp_sendpage.inet_sendpage.kernel_sendpage.sock_sendpage.pipe_to_sendpage
> 30.02 -0.5 29.55 perf-profile.calltrace.cycles-pp.pipe_to_sendpage.__splice_from_pipe.splice_from_pipe.direct_splice_actor.splice_direct_to_actor
> 28.77 -0.4 28.34 perf-profile.calltrace.cycles-pp.sock_sendpage.pipe_to_sendpage.__splice_from_pipe.splice_from_pipe.direct_splice_actor
> 27.68 -0.4 27.27 perf-profile.calltrace.cycles-pp.inet_sendpage.kernel_sendpage.sock_sendpage.pipe_to_sendpage.__splice_from_pipe
> 27.98 -0.4 27.58 perf-profile.calltrace.cycles-pp.kernel_sendpage.sock_sendpage.pipe_to_sendpage.__splice_from_pipe.splice_from_pipe
> 20.30 -0.3 19.95 perf-profile.calltrace.cycles-pp.tcp_sendpage_locked.tcp_sendpage.inet_sendpage.kernel_sendpage.sock_sendpage
> 19.49 -0.3 19.16 perf-profile.calltrace.cycles-pp.do_tcp_sendpages.tcp_sendpage_locked.tcp_sendpage.inet_sendpage.kernel_sendpage
> 9.78 -0.2 9.53 perf-profile.calltrace.cycles-pp.tcp_write_xmit.__tcp_push_pending_frames.do_tcp_sendpages.tcp_sendpage_locked.tcp_sendpage
> 9.94 -0.2 9.70 perf-profile.calltrace.cycles-pp.__tcp_push_pending_frames.do_tcp_sendpages.tcp_sendpage_locked.tcp_sendpage.inet_sendpage
> 6.32 -0.2 6.09 perf-profile.calltrace.cycles-pp.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.do_tcp_sendpages.tcp_sendpage_locked
> 5.59 -0.2 5.42 perf-profile.calltrace.cycles-pp.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.do_tcp_sendpages
> 5.19 -0.2 5.02 perf-profile.calltrace.cycles-pp.ip_output.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames
> 4.79 -0.2 4.62 perf-profile.calltrace.cycles-pp.ip_rcv.__netif_receive_skb_one_core.process_backlog.net_rx_action.__softirqentry_text_start
> 5.51 -0.2 5.35 perf-profile.calltrace.cycles-pp.__softirqentry_text_start.do_softirq_own_stack.do_softirq.__local_bh_enable_ip.ip_finish_output2
> 5.00 -0.2 4.84 perf-profile.calltrace.cycles-pp.__netif_receive_skb_one_core.process_backlog.net_rx_action.__softirqentry_text_start.do_softirq_own_stack
> 5.52 -0.2 5.36 perf-profile.calltrace.cycles-pp.do_softirq_own_stack.do_softirq.__local_bh_enable_ip.ip_finish_output2.ip_output
> 5.37 -0.2 5.21 perf-profile.calltrace.cycles-pp.net_rx_action.__softirqentry_text_start.do_softirq_own_stack.do_softirq.__local_bh_enable_ip
> 4.68 -0.2 4.53 perf-profile.calltrace.cycles-pp.security_file_permission.do_sendfile.__x64_sys_sendfile64.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 5.61 -0.2 5.46 perf-profile.calltrace.cycles-pp.do_softirq.__local_bh_enable_ip.ip_finish_output2.ip_output.__ip_queue_xmit
> 5.21 -0.2 5.06 perf-profile.calltrace.cycles-pp.process_backlog.net_rx_action.__softirqentry_text_start.do_softirq_own_stack.do_softirq
> 4.58 -0.2 4.42 perf-profile.calltrace.cycles-pp.ip_finish_output2.ip_output.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit
> 5.66 -0.2 5.50 perf-profile.calltrace.cycles-pp.__local_bh_enable_ip.ip_finish_output2.ip_output.__ip_queue_xmit.__tcp_transmit_skb
> 4.39 -0.2 4.24 perf-profile.calltrace.cycles-pp.__entry_SYSCALL_64_trampoline
> 2.87 ± 2% -0.1 2.76 perf-profile.calltrace.cycles-pp.selinux_file_permission.security_file_permission.do_sendfile.__x64_sys_sendfile64.do_syscall_64
> 1.25 ± 3% -0.1 1.15 perf-profile.calltrace.cycles-pp.__inode_security_revalidate.selinux_file_permission.security_file_permission.do_sendfile.__x64_sys_sendfile64
> 4.30 -0.1 4.20 perf-profile.calltrace.cycles-pp.ip_local_deliver_finish.ip_local_deliver.ip_rcv.__netif_receive_skb_one_core.process_backlog
> 1.86 -0.1 1.77 ± 3% perf-profile.calltrace.cycles-pp.release_sock.tcp_sendpage.inet_sendpage.kernel_sendpage.sock_sendpage
> 1.14 -0.1 1.08 ± 2% perf-profile.calltrace.cycles-pp.file_has_perm.security_file_permission.do_splice_direct.do_sendfile.__x64_sys_sendfile64
> 0.69 -0.1 0.63 perf-profile.calltrace.cycles-pp.tcp_release_cb.release_sock.tcp_sendpage.inet_sendpage.kernel_sendpage
> 0.61 ± 2% -0.1 0.56 ± 2% perf-profile.calltrace.cycles-pp.__might_fault.__x64_sys_sendfile64.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.61 ± 2% -0.0 0.57 ± 4% perf-profile.calltrace.cycles-pp.avc_has_perm.file_has_perm.security_file_permission.do_splice_direct.do_sendfile
> 0.57 ± 2% +0.0 0.61 ± 2% perf-profile.calltrace.cycles-pp.___might_sleep.__might_fault.copy_page_to_iter.skb_copy_datagram_iter.tcp_recvmsg
> 90.63 +0.2 90.83 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 91.39 +0.2 91.62 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
> 20.12 +1.3 21.46 perf-profile.calltrace.cycles-pp.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 20.10 +1.3 21.44 perf-profile.calltrace.cycles-pp.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 19.84 +1.4 21.24 perf-profile.calltrace.cycles-pp.tcp_recvmsg.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64
> 19.89 +1.4 21.30 perf-profile.calltrace.cycles-pp.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 15.07 +1.6 16.65 perf-profile.calltrace.cycles-pp.skb_copy_datagram_iter.tcp_recvmsg.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom
> 14.25 +1.6 15.82 perf-profile.calltrace.cycles-pp.copy_page_to_iter.skb_copy_datagram_iter.tcp_recvmsg.inet_recvmsg.__sys_recvfrom
> 11.15 +1.6 12.74 perf-profile.calltrace.cycles-pp.copyout.copy_page_to_iter.skb_copy_datagram_iter.tcp_recvmsg.inet_recvmsg
> 10.84 +1.6 12.45 perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyout.copy_page_to_iter.skb_copy_datagram_iter.tcp_recvmsg
> 69.33 -1.1 68.23 perf-profile.children.cycles-pp.__x64_sys_sendfile64
> 65.94 -1.0 64.92 perf-profile.children.cycles-pp.do_sendfile
> 55.98 -0.8 55.14 perf-profile.children.cycles-pp.do_splice_direct
> 52.38 -0.8 51.60 perf-profile.children.cycles-pp.splice_direct_to_actor
> 35.77 -0.6 35.16 perf-profile.children.cycles-pp.direct_splice_actor
> 34.91 -0.6 34.33 perf-profile.children.cycles-pp.splice_from_pipe
> 34.07 -0.5 33.53 perf-profile.children.cycles-pp.__splice_from_pipe
> 30.09 -0.5 29.62 perf-profile.children.cycles-pp.pipe_to_sendpage
> 26.31 -0.5 25.86 perf-profile.children.cycles-pp.tcp_sendpage
> 28.85 -0.4 28.42 perf-profile.children.cycles-pp.sock_sendpage
> 27.75 -0.4 27.33 perf-profile.children.cycles-pp.inet_sendpage
> 28.05 -0.4 27.65 perf-profile.children.cycles-pp.kernel_sendpage
> 20.38 -0.3 20.03 perf-profile.children.cycles-pp.tcp_sendpage_locked
> 19.62 -0.3 19.29 perf-profile.children.cycles-pp.do_tcp_sendpages
> 9.69 -0.3 9.42 perf-profile.children.cycles-pp.security_file_permission
> 8.60 -0.2 8.38 perf-profile.children.cycles-pp.__tcp_transmit_skb
> 10.66 -0.2 10.43 perf-profile.children.cycles-pp.tcp_write_xmit
> 10.79 -0.2 10.56 perf-profile.children.cycles-pp.__tcp_push_pending_frames
> 7.82 -0.2 7.64 perf-profile.children.cycles-pp.__ip_queue_xmit
> 7.38 -0.2 7.20 perf-profile.children.cycles-pp.ip_output
> 6.36 -0.2 6.19 perf-profile.children.cycles-pp.__local_bh_enable_ip
> 5.95 -0.2 5.78 perf-profile.children.cycles-pp.__entry_SYSCALL_64_trampoline
> 4.86 -0.2 4.69 perf-profile.children.cycles-pp.ip_rcv
> 5.07 -0.2 4.91 perf-profile.children.cycles-pp.__netif_receive_skb_one_core
> 5.44 -0.2 5.29 perf-profile.children.cycles-pp.net_rx_action
> 5.58 -0.2 5.42 perf-profile.children.cycles-pp.do_softirq_own_stack
> 5.28 -0.2 5.13 perf-profile.children.cycles-pp.process_backlog
> 6.70 -0.2 6.55 perf-profile.children.cycles-pp.ip_finish_output2
> 5.67 -0.1 5.52 perf-profile.children.cycles-pp.do_softirq
> 2.76 ± 3% -0.1 2.62 perf-profile.children.cycles-pp.__inode_security_revalidate
> 1.39 ± 4% -0.1 1.27 ± 2% perf-profile.children.cycles-pp._cond_resched
> 4.45 -0.1 4.34 perf-profile.children.cycles-pp.ip_local_deliver
> 0.73 ± 5% -0.1 0.64 ± 3% perf-profile.children.cycles-pp.rcu_all_qs
> 0.72 -0.1 0.65 perf-profile.children.cycles-pp.tcp_release_cb
> 0.30 ± 5% -0.1 0.24 ± 3% perf-profile.children.cycles-pp.tcp_rcv_space_adjust
> 0.43 ± 4% -0.0 0.39 ± 5% perf-profile.children.cycles-pp.copy_user_generic_unrolled
> 0.17 ± 7% -0.0 0.12 ± 6% perf-profile.children.cycles-pp.ip_rcv_finish_core
> 0.19 ± 7% -0.0 0.15 ± 6% perf-profile.children.cycles-pp.ip_rcv_finish
> 0.14 ± 5% -0.0 0.11 ± 8% perf-profile.children.cycles-pp.tcp_rearm_rto
> 0.10 ± 11% -0.0 0.06 ± 6% perf-profile.children.cycles-pp.sockfd_lookup_light
> 0.07 ± 5% +0.0 0.09 ± 5% perf-profile.children.cycles-pp.skb_entail
> 0.11 ± 3% +0.0 0.13 ± 6% perf-profile.children.cycles-pp.scheduler_tick
> 0.51 ± 3% +0.0 0.55 ± 3% perf-profile.children.cycles-pp.tcp_established_options
> 90.70 +0.2 90.90 perf-profile.children.cycles-pp.do_syscall_64
> 91.47 +0.2 91.70 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
> 20.13 +1.3 21.47 perf-profile.children.cycles-pp.__x64_sys_recvfrom
> 20.10 +1.3 21.44 perf-profile.children.cycles-pp.__sys_recvfrom
> 19.89 +1.4 21.30 perf-profile.children.cycles-pp.inet_recvmsg
> 19.84 +1.4 21.26 perf-profile.children.cycles-pp.tcp_recvmsg
> 16.63 +1.6 18.19 perf-profile.children.cycles-pp.copy_page_to_iter
> 15.08 +1.6 16.66 perf-profile.children.cycles-pp.skb_copy_datagram_iter
> 11.24 +1.6 12.82 perf-profile.children.cycles-pp.copyout
> 11.24 +1.6 12.82 perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
> 5.68 -0.2 5.51 perf-profile.self.cycles-pp.__entry_SYSCALL_64_trampoline
> 0.67 -0.1 0.60 ± 2% perf-profile.self.cycles-pp.tcp_release_cb
> 0.93 ± 2% -0.1 0.86 ± 2% perf-profile.self.cycles-pp.__inode_security_revalidate
> 1.09 ± 2% -0.0 1.05 ± 2% perf-profile.self.cycles-pp.do_syscall_64
> 0.16 ± 9% -0.0 0.12 ± 7% perf-profile.self.cycles-pp.ip_rcv_finish_core
> 0.09 ± 11% -0.0 0.05 ± 62% perf-profile.self.cycles-pp.__tcp_ack_snd_check
> 0.40 ± 3% -0.0 0.36 ± 7% perf-profile.self.cycles-pp.copy_user_generic_unrolled
> 0.80 -0.0 0.77 ± 2% perf-profile.self.cycles-pp.current_time
> 0.28 ± 2% -0.0 0.25 ± 3% perf-profile.self.cycles-pp.tcp_recvmsg
> 0.27 ± 6% -0.0 0.24 ± 5% perf-profile.self.cycles-pp.__alloc_skb
> 0.18 ± 6% -0.0 0.15 ± 7% perf-profile.self.cycles-pp.tcp_mstamp_refresh
> 0.10 ± 5% -0.0 0.08 ± 5% perf-profile.self.cycles-pp.__tcp_select_window
> 0.22 ± 3% +0.0 0.24 ± 2% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
> 0.46 ± 5% +0.0 0.51 ± 4% perf-profile.self.cycles-pp.tcp_established_options
> 11.14 +1.5 12.68 perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
>
>
>
> ***************************************************************************************************
> lkp-bdw-de1: 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 8G memory
> =========================================================================================
> cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase/ucode:
> cs-localhost/gcc-7/performance/ipv4/x86_64-rhel-7.2/200%/debian-x86_64-2018-04-03.cgz/900s/lkp-bdw-de1/TCP_MAERTS/netperf/0x7000013
>
> commit:
> 3ff6cde846 ("hns3: Another build fix.")
> a337531b94 ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
>
> 3ff6cde846857d45 a337531b942bd8a03e7052444d
> ---------------- --------------------------
> fail:runs %reproduction fail:runs
> | | |
> 1:4 2% 1:4 perf-profile.children.cycles-pp.schedule_timeout
> %stddev %change %stddev
> \ | \
> 2497 -5.9% 2349 netperf.Throughput_Mbps
> 79914 -5.9% 75172 netperf.Throughput_total_Mbps
> 2472 +4.7% 2588 netperf.time.maximum_resident_set_size
> 8998 +8.0% 9715 netperf.time.minor_page_faults
> 88.91 -13.7% 76.77 netperf.time.user_time
> 5.487e+08 -5.9% 5.162e+08 netperf.workload
> 50507215 ± 49% -63.0% 18671277 ± 27% cpuidle.C3.time
> 111760 ± 6% +12.4% 125584 ± 3% meminfo.DirectMap4k
> 0.35 ± 49% -0.2 0.13 ± 29% turbostat.C3%
> 42.19 -1.2% 41.70 turbostat.PkgWatt
> 1988 +9.6% 2180 ± 2% sched_debug.cfs_rq:/.util_est_enqueued.max
> 401.62 ± 3% +11.2% 446.64 ± 4% sched_debug.cfs_rq:/.util_est_enqueued.stddev
> 3.91 ± 12% -18.4% 3.19 ± 14% sched_debug.cpu.nr_uninterruptible.stddev
> 697.25 ± 4% +48.3% 1034 ± 19% slabinfo.dmaengine-unmap-16.active_objs
> 697.25 ± 4% +48.3% 1034 ± 19% slabinfo.dmaengine-unmap-16.num_objs
> 1464 ± 11% -20.9% 1157 ± 9% slabinfo.skbuff_head_cache.active_objs
> 1464 ± 11% -20.9% 1157 ± 9% slabinfo.skbuff_head_cache.num_objs
> 70462 +1.3% 71390 proc-vmstat.nr_active_anon
> 66190 +1.5% 67154 proc-vmstat.nr_anon_pages
> 70462 +1.3% 71390 proc-vmstat.nr_zone_active_anon
> 2.756e+08 -6.0% 2.592e+08 proc-vmstat.numa_hit
> 2.756e+08 -6.0% 2.592e+08 proc-vmstat.numa_local
> 2.197e+09 -6.0% 2.067e+09 proc-vmstat.pgalloc_normal
> 2.197e+09 -6.0% 2.066e+09 proc-vmstat.pgfree
> 5.831e+11 -7.8% 5.377e+11 perf-stat.branch-instructions
> 1.567e+10 -8.9% 1.428e+10 perf-stat.branch-misses
> 6.246e+11 -4.4% 5.974e+11 perf-stat.cache-misses
> 6.246e+11 -4.4% 5.974e+11 perf-stat.cache-references
> 11.79 +8.4% 12.78 perf-stat.cpi
> 122574 +2.4% 125502 perf-stat.cpu-migrations
> 1.473e+12 -7.0% 1.369e+12 perf-stat.dTLB-loads
> 0.07 ± 13% +0.0 0.09 ± 6% perf-stat.dTLB-store-miss-rate%
> 7.83e+08 ± 13% +15.6% 9.049e+08 ± 6% perf-stat.dTLB-store-misses
> 1.092e+12 -6.8% 1.017e+12 perf-stat.dTLB-stores
> 1.153e+09 -10.1% 1.037e+09 perf-stat.iTLB-load-misses
> 2.66e+08 ± 4% -7.0% 2.474e+08 perf-stat.iTLB-loads
> 2.994e+12 -7.8% 2.761e+12 perf-stat.instructions
> 0.08 -7.8% 0.08 perf-stat.ipc
> 5456 -2.0% 5348 perf-stat.path-length
> 2.62 -0.1 2.49 perf-profile.calltrace.cycles-pp.tcp_write_xmit.__tcp_push_pending_frames.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv
> 2.64 -0.1 2.51 perf-profile.calltrace.cycles-pp.__tcp_push_pending_frames.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_local_deliver_finish
> 2.83 -0.1 2.73 perf-profile.calltrace.cycles-pp.__free_pages_ok.skb_release_data.__kfree_skb.tcp_recvmsg.inet_recvmsg
> 3.64 -0.1 3.54 perf-profile.calltrace.cycles-pp.__kfree_skb.tcp_recvmsg.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom
> 3.27 -0.1 3.18 perf-profile.calltrace.cycles-pp.skb_release_data.__kfree_skb.tcp_recvmsg.inet_recvmsg.__sys_recvfrom
> 98.03 +0.1 98.11 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
> 97.89 +0.1 97.96 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.44 ± 58% +0.3 0.71 ± 5% perf-profile.calltrace.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt.copy_user_enhanced_fast_string.copyout.copy_page_to_iter
> 2.92 ± 6% +0.4 3.29 ± 4% perf-profile.calltrace.cycles-pp.apic_timer_interrupt.copy_user_enhanced_fast_string.copyout.copy_page_to_iter.skb_copy_datagram_iter
> 0.00 +0.5 0.55 ± 6% perf-profile.calltrace.cycles-pp.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.copy_user_enhanced_fast_string.copyout
> 3.64 -0.1 3.52 perf-profile.children.cycles-pp.tcp_write_xmit
> 3.60 -0.1 3.48 perf-profile.children.cycles-pp.__tcp_push_pending_frames
> 2.84 -0.1 2.74 perf-profile.children.cycles-pp.__free_pages_ok
> 4.08 -0.1 4.00 perf-profile.children.cycles-pp.__kfree_skb
> 0.80 ± 2% -0.1 0.74 ± 3% perf-profile.children.cycles-pp.__entry_SYSCALL_64_trampoline
> 0.23 ± 4% -0.0 0.20 ± 5% perf-profile.children.cycles-pp.__sk_mem_schedule
> 0.22 ± 4% -0.0 0.19 ± 5% perf-profile.children.cycles-pp.__sk_mem_raise_allocated
> 0.06 -0.0 0.04 ± 57% perf-profile.children.cycles-pp.tcp_release_cb
> 0.08 ± 6% -0.0 0.06 ± 15% perf-profile.children.cycles-pp.__tcp_select_window
> 0.23 +0.0 0.24 ± 2% perf-profile.children.cycles-pp.__tcp_send_ack
> 0.06 ± 11% +0.0 0.08 ± 5% perf-profile.children.cycles-pp.___perf_sw_event
> 0.06 ± 14% +0.0 0.09 ± 13% perf-profile.children.cycles-pp.tcp_write_timer_handler
> 0.12 ± 7% +0.0 0.15 ± 5% perf-profile.children.cycles-pp.update_curr
> 0.06 ± 11% +0.0 0.09 ± 17% perf-profile.children.cycles-pp.call_timer_fn
> 0.17 ± 4% +0.0 0.20 ± 3% perf-profile.children.cycles-pp.___slab_alloc
> 0.18 ± 4% +0.0 0.21 ± 3% perf-profile.children.cycles-pp.__slab_alloc
> 0.05 ± 58% +0.0 0.08 ± 15% perf-profile.children.cycles-pp.tcp_write_timer
> 0.04 ± 58% +0.0 0.08 ± 16% perf-profile.children.cycles-pp.tcp_send_loss_probe
> 0.32 ± 3% +0.0 0.35 perf-profile.children.cycles-pp.kmem_cache_alloc_node
> 0.14 ± 7% +0.0 0.19 ± 16% perf-profile.children.cycles-pp.preempt_schedule_common
> 0.21 ± 12% +0.1 0.27 ± 6% perf-profile.children.cycles-pp.task_tick_fair
> 0.00 +0.1 0.06 ± 11% perf-profile.children.cycles-pp.__tcp_retransmit_skb
> 0.51 ± 3% +0.1 0.57 ± 6% perf-profile.children.cycles-pp.__sched_text_start
> 1.61 +0.1 1.68 ± 2% perf-profile.children.cycles-pp.__release_sock
> 1.06 ± 3% +0.1 1.14 ± 2% perf-profile.children.cycles-pp.tcp_ack
> 0.28 ± 9% +0.1 0.36 ± 4% perf-profile.children.cycles-pp.scheduler_tick
> 98.09 +0.1 98.18 perf-profile.children.cycles-pp.do_syscall_64
> 98.23 +0.1 98.32 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
> 0.49 ± 8% +0.1 0.58 ± 5% perf-profile.children.cycles-pp.update_process_times
> 0.50 ± 8% +0.1 0.61 ± 6% perf-profile.children.cycles-pp.tick_sched_handle
> 0.54 ± 9% +0.1 0.67 ± 5% perf-profile.children.cycles-pp.tick_sched_timer
> 0.79 ± 8% +0.1 0.93 ± 3% perf-profile.children.cycles-pp.__hrtimer_run_queues
> 0.93 ± 9% +0.2 1.09 ± 2% perf-profile.children.cycles-pp.hrtimer_interrupt
> 1.13 ± 10% +0.2 1.37 ± 4% perf-profile.children.cycles-pp.smp_apic_timer_interrupt
> 2.51 ± 6% +0.4 2.87 ± 3% perf-profile.children.cycles-pp.apic_timer_interrupt
> 70.21 +0.4 70.63 perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
> 1.61 -0.1 1.49 ± 2% perf-profile.self.cycles-pp.copy_page_to_iter
> 0.78 ± 2% -0.1 0.72 ± 3% perf-profile.self.cycles-pp.__entry_SYSCALL_64_trampoline
> 1.37 -0.1 1.32 perf-profile.self.cycles-pp.__free_pages_ok
> 0.21 ± 5% -0.0 0.18 ± 4% perf-profile.self.cycles-pp.__sk_mem_raise_allocated
> 0.65 ± 2% -0.0 0.62 perf-profile.self.cycles-pp.free_one_page
> 0.41 ± 2% -0.0 0.39 ± 4% perf-profile.self.cycles-pp.skb_copy_datagram_iter
> 0.08 ± 6% -0.0 0.06 ± 15% perf-profile.self.cycles-pp.__tcp_select_window
> 0.10 ± 5% -0.0 0.08 ± 8% perf-profile.self.cycles-pp.import_single_range
> 0.14 ± 5% +0.0 0.16 ± 5% perf-profile.self.cycles-pp.___slab_alloc
> 0.19 ± 3% +0.0 0.21 ± 3% perf-profile.self.cycles-pp.kmem_cache_alloc_node
> 0.15 ± 4% +0.0 0.17 ± 4% perf-profile.self.cycles-pp.__might_sleep
> 0.03 ±100% +0.0 0.07 ± 13% perf-profile.self.cycles-pp.___perf_sw_event
>
>
>
> ***************************************************************************************************
> lkp-u410: 4 threads Intel(R) Core(TM) i5-3317U CPU @ 1.70GHz with 4G memory
> =========================================================================================
> cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase/ucode:
> cs-localhost/gcc-7/performance/ipv4/x86_64-rhel-7.2/200%/debian-x86_64-2018-04-03.cgz/900s/lkp-u410/TCP_MAERTS/netperf/0x20
>
> commit:
> 3ff6cde846 ("hns3: Another build fix.")
> a337531b94 ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
>
> 3ff6cde846857d45 a337531b942bd8a03e7052444d
> ---------------- --------------------------
> fail:runs %reproduction fail:runs
> | | |
> 4:4 -100% :4 dmesg.RIP:intel_modeset_init[i915]
> 4:4 -100% :4 dmesg.WARNING:at_drivers/gpu/drm/i915/intel_display.c:#intel_modeset_init[i915]
> 2:4 -3% 2:4 perf-profile.children.cycles-pp.schedule_timeout
> %stddev %change %stddev
> \ | \
> 3879 -3.2% 3753 netperf.Throughput_Mbps
> 31036 -3.2% 30030 netperf.Throughput_total_Mbps
> 2463 +3.6% 2552 netperf.time.maximum_resident_set_size
> 2499 +7.5% 2685 netperf.time.minor_page_faults
> 24.96 -14.8% 21.28 ± 8% netperf.time.user_time
> 543040 ± 13% -15.9% 456816 ± 2% netperf.time.voluntary_context_switches
> 2.131e+08 -3.2% 2.062e+08 netperf.workload
> 21274 +3.3% 21986 interrupts.CAL:Function_call_interrupts
> 826.00 ± 6% -27.1% 602.00 ± 23% slabinfo.skbuff_head_cache.active_objs
> 3904 ± 2% -4.5% 3728 vmstat.system.cs
> 56.50 ± 2% +8.8% 61.50 ± 5% turbostat.CoreTmp
> 56.75 ± 2% +8.4% 61.50 ± 5% turbostat.PkgTmp
> 4224 ±173% +294.2% 16653 ± 52% sched_debug.cfs_rq:/.spread0.avg
> 110.92 ± 8% -22.2% 86.34 ± 10% sched_debug.cfs_rq:/.util_avg.stddev
> 896147 ± 3% -11.3% 795033 ± 4% sched_debug.cpu.avg_idle.max
> 162406 ± 9% -26.1% 119960 ± 21% sched_debug.cpu.avg_idle.stddev
> 59886 ± 3% -3.8% 57590 proc-vmstat.nr_dirty_background_threshold
> 119920 ± 3% -3.8% 115322 proc-vmstat.nr_dirty_threshold
> 628429 ± 3% -3.7% 605425 proc-vmstat.nr_free_pages
> 1.071e+08 -3.2% 1.036e+08 proc-vmstat.numa_hit
> 1.071e+08 -3.2% 1.036e+08 proc-vmstat.numa_local
> 8.503e+08 -3.2% 8.229e+08 proc-vmstat.pgfree
> 2.265e+11 -5.7% 2.135e+11 perf-stat.branch-instructions
> 3.01 -0.1 2.94 perf-stat.branch-miss-rate%
> 6.809e+09 -7.8% 6.279e+09 ± 3% perf-stat.branch-misses
> 30.13 +2.0 32.13 perf-stat.cache-miss-rate%
> 5.149e+10 +3.2% 5.314e+10 perf-stat.cache-misses
> 1.709e+11 -3.2% 1.654e+11 perf-stat.cache-references
> 3532029 ± 2% -4.5% 3373137 perf-stat.context-switches
> 7.31 +6.2% 7.76 perf-stat.cpi
> 5.633e+09 ± 2% -5.8% 5.308e+09 perf-stat.dTLB-load-misses
> 7.264e+11 -4.1% 6.964e+11 perf-stat.dTLB-loads
> 6.35e+11 -4.0% 6.097e+11 perf-stat.dTLB-stores
> 4.029e+08 -7.1% 3.743e+08 ± 2% perf-stat.iTLB-load-misses
> 1.157e+12 -5.7% 1.091e+12 perf-stat.instructions
> 0.14 -5.8% 0.13 perf-stat.ipc
> 5426 -2.5% 5289 perf-stat.path-length
> 1.16 ± 6% -0.2 0.99 ± 3% perf-profile.calltrace.cycles-pp.__entry_SYSCALL_64_trampoline
> 0.99 ± 6% -0.1 0.88 ± 10% perf-profile.calltrace.cycles-pp.tcp_v4_do_rcv.__release_sock.release_sock.tcp_recvmsg.inet_recvmsg
> 96.58 +0.3 96.87 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
> 26.12 ± 2% +1.3 27.40 perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin._copy_from_iter_full.tcp_sendmsg_locked.tcp_sendmsg
> 26.39 ± 2% +1.3 27.69 perf-profile.calltrace.cycles-pp.copyin._copy_from_iter_full.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg
> 27.12 ± 3% +1.4 28.48 perf-profile.calltrace.cycles-pp._copy_from_iter_full.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.__sys_sendto
> 41.73 ± 2% +1.7 43.40 ± 2% perf-profile.calltrace.cycles-pp.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto
> 43.17 ± 2% +1.7 44.87 ± 2% perf-profile.calltrace.cycles-pp.tcp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64
> 43.75 ± 2% +1.8 45.51 perf-profile.calltrace.cycles-pp.sock_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 44.88 ± 2% +1.8 46.63 perf-profile.calltrace.cycles-pp.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 44.73 ± 2% +1.8 46.53 perf-profile.calltrace.cycles-pp.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 1.38 ± 6% -0.2 1.20 ± 3% perf-profile.children.cycles-pp.__entry_SYSCALL_64_trampoline
> 0.42 ± 9% -0.1 0.31 ± 9% perf-profile.children.cycles-pp.tcp_queue_rcv
> 0.79 ± 6% -0.1 0.68 ± 5% perf-profile.children.cycles-pp.ktime_get_with_offset
> 0.32 ± 12% -0.1 0.21 ± 33% perf-profile.children.cycles-pp.scheduler_tick
> 0.35 ± 12% -0.1 0.26 ± 11% perf-profile.children.cycles-pp.tcp_try_coalesce
> 0.29 ± 10% -0.1 0.20 ± 17% perf-profile.children.cycles-pp.skb_try_coalesce
> 0.88 ± 2% -0.1 0.79 ± 4% perf-profile.children.cycles-pp.tcp_mstamp_refresh
> 0.32 ± 9% -0.1 0.26 ± 18% perf-profile.children.cycles-pp.ip_local_out
> 0.41 ± 3% +0.0 0.45 ± 4% perf-profile.children.cycles-pp.selinux_ip_postroute
> 0.03 ±102% +0.1 0.09 ± 24% perf-profile.children.cycles-pp.lock_timer_base
> 0.00 +0.1 0.08 ± 29% perf-profile.children.cycles-pp.raw_local_deliver
> 0.57 ± 4% +0.1 0.66 ± 7% perf-profile.children.cycles-pp.tcp_event_new_data_sent
> 0.20 ± 28% +0.1 0.29 ± 21% perf-profile.children.cycles-pp._cond_resched
> 64.27 +0.5 64.78 perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
> 26.41 ± 2% +1.3 27.70 perf-profile.children.cycles-pp.copyin
> 27.16 ± 3% +1.3 28.50 perf-profile.children.cycles-pp._copy_from_iter_full
> 41.76 ± 2% +1.7 43.44 ± 2% perf-profile.children.cycles-pp.tcp_sendmsg_locked
> 43.19 ± 2% +1.7 44.88 ± 2% perf-profile.children.cycles-pp.tcp_sendmsg
> 44.88 ± 2% +1.8 46.65 perf-profile.children.cycles-pp.__x64_sys_sendto
> 43.75 ± 2% +1.8 45.51 perf-profile.children.cycles-pp.sock_sendmsg
> 44.74 ± 2% +1.8 46.54 perf-profile.children.cycles-pp.__sys_sendto
> 1.21 ± 8% -0.2 0.99 ± 5% perf-profile.self.cycles-pp.copy_page_to_iter
> 1.32 ± 6% -0.2 1.15 ± 3% perf-profile.self.cycles-pp.__entry_SYSCALL_64_trampoline
> 0.29 ± 9% -0.1 0.20 ± 18% perf-profile.self.cycles-pp.skb_try_coalesce
> 0.50 ± 9% -0.1 0.42 ± 10% perf-profile.self.cycles-pp.ktime_get_with_offset
> 0.19 ± 14% -0.1 0.12 ± 10% perf-profile.self.cycles-pp.__local_bh_enable_ip
> 0.08 ± 10% -0.0 0.03 ±102% perf-profile.self.cycles-pp.selinux_sock_rcv_skb_compat
> 0.13 ± 3% -0.0 0.08 ± 57% perf-profile.self.cycles-pp.__x64_sys_sendto
> 0.07 ± 12% -0.0 0.03 ±100% perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
> 0.11 ± 11% -0.0 0.08 ± 22% perf-profile.self.cycles-pp.__sys_recvfrom
> 0.05 ± 61% +0.0 0.09 ± 11% perf-profile.self.cycles-pp.selinux_ip_postroute
> 0.09 ± 20% +0.1 0.15 ± 31% perf-profile.self.cycles-pp.rcu_all_qs
> 0.00 +0.1 0.07 ± 28% perf-profile.self.cycles-pp.raw_local_deliver
>
>
>
>
>
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
>
>
> Thanks,
> Rong Chen
>
_______________________________________________
LKP mailing list
LKP@lists.01.org
https://lists.01.org/mailman/listinfo/lkp
^ permalink raw reply
* [iproute PATCH] utils.h: provide fallback CLOCK_TAI definition
From: Peter Korsgaard @ 2018-10-27 15:31 UTC (permalink / raw)
To: Stephen Hemminger, Vinicius Costa Gomes; +Cc: netdev, Peter Korsgaard
q_{etf,taprio}.c uses CLOCK_TAI, which isn't exposed by glibc < 2.21 or
uClibc, breaking the build. Provide a fallback definition like it is done
for IPPROTO_MPLS and others.
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
---
include/utils.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/include/utils.h b/include/utils.h
index 258d630e..685d2c1d 100644
--- a/include/utils.h
+++ b/include/utils.h
@@ -126,6 +126,10 @@ struct ipx_addr {
#define IPPROTO_MPLS 137
#endif
+#ifndef CLOCK_TAI
+# define CLOCK_TAI 11
+#endif
+
__u32 get_addr32(const char *name);
int get_addr_1(inet_prefix *dst, const char *arg, int family);
int get_prefix_1(inet_prefix *dst, char *arg, int family);
--
2.11.0
^ permalink raw reply related
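As an illustration of the fallback described in the patch above, here is a minimal sketch, assuming a Linux target. The value 11 is taken from the patch; the helper name read_tai_seconds is hypothetical and is not part of iproute2.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Fallback taken from the patch: some C libraries do not expose
 * CLOCK_TAI in their headers, so define it only when it is missing. */
#ifndef CLOCK_TAI
# define CLOCK_TAI 11
#endif

/* Hypothetical helper: stores the TAI seconds in *out and returns 0,
 * or returns -1 if the running kernel rejects the clock id. */
static int read_tai_seconds(long long *out)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_TAI, &ts) != 0)
		return -1;
	*out = (long long)ts.tv_sec;
	return 0;
}
```

The #ifndef guard keeps the definition from clashing with libraries that already provide the constant, which is the same pattern the patch follows for IPPROTO_MPLS.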
* Re: [PATCH] sctp: socket.c validate sprstat_policy
From: David Miller @ 2018-10-28 0:03 UTC (permalink / raw)
To: tomasbortoli
Cc: vyasevich, nhorman, marcelo.leitner, linux-sctp, netdev,
linux-kernel
In-Reply-To: <cb41ca17-4bad-bd21-6938-aee960a8ba9b@gmail.com>
From: Tomas Bortoli <tomasbortoli@gmail.com>
Date: Sat, 27 Oct 2018 22:43:43 +0200
> I just realized we also have to check for less than 0 indexes..
How about the fact that your original submission didn't even compile?
I hope you realized that first.
^ permalink raw reply
* Re: checksumming on non-local forward path
From: Andrew Lunn @ 2018-10-27 14:26 UTC (permalink / raw)
To: Jason A. Donenfeld; +Cc: Netdev
In-Reply-To: <CAHmME9rMhHwyiw0t+0oGS6XwPkmrbG_8TPmtWdS3aW9AFByphg@mail.gmail.com>
> What would you think of a flag on the receiving end like,
> "CHECKSUM_INVALID_BUT_UNNECESSARY"? It would be treated as
> CHECKSUM_UNNECESSARY in the case that the packet is locally
> received. But if the packet is going to be forwarded instead, then
> skb_checksum_help is called on it before forwarding onward.
>
> AFAICT, wireguard isn't the only thing that could benefit from this:
> virtio is another case where it's not always necessary for the sender
> to call skb_checksum_help, when the receiver could just do it
> conditionally based on whether it's being forwarded.
Hi Jason
It is the sort of thing which breaks in hard to find ways. I've run
network simulations with machine instances running in containers. It
used veth pairs to connect the instances to a central 'switching'
namespace which did the interconnect between the instances, using lots
of bridges. After a while, my simulation got bigger than a single host
could support. So I split it over multiple servers, using GRE tunnels
between the bridges. It took me a while to notice the network was
actually in two segments, because frames going over GRE were getting
tossed with checksum issues. It was not the GRE tunnel at fault. It
took a while to trace it back to where the checksumming was turned
off, a TAP interface I think, but I don't remember.
How do you reliably decide if a frame needs checksums, when you cannot
peer down the pipe of bridges, veth, GRE tunnels and TAP interfaces
the frame is about to take?
Andrew
^ permalink raw reply
* [PATCH] bonding: fix length of actor system
From: Tobias Jungel @ 2018-10-27 13:31 UTC (permalink / raw)
To: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek; +Cc: Eric Dumazet, netdev
The attribute IFLA_BOND_AD_ACTOR_SYSTEM is sent to user space having the
length of sizeof(bond->params.ad_actor_system) which is 8 byte. This
patch aligns the length to ETH_ALEN to have the same MAC address exposed
as using sysfs.
fixes f87fda00b6ed2
Signed-off-by: Tobias Jungel <tobias.jungel@gmail.com>
---
drivers/net/bonding/bond_netlink.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/bonding/bond_netlink.c b/drivers/net/bonding/bond_netlink.c
index 9697977b80f0..6b9ad8673218 100644
--- a/drivers/net/bonding/bond_netlink.c
+++ b/drivers/net/bonding/bond_netlink.c
@@ -638,8 +638,7 @@ static int bond_fill_info(struct sk_buff *skb,
goto nla_put_failure;
if (nla_put(skb, IFLA_BOND_AD_ACTOR_SYSTEM,
- sizeof(bond->params.ad_actor_system),
- &bond->params.ad_actor_system))
+ ETH_ALEN, &bond->params.ad_actor_system))
goto nla_put_failure;
}
if (!bond_3ad_get_active_agg_info(bond, &info)) {
^ permalink raw reply related
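The length mismatch described in the patch above can be sketched in user space as follows. This is only an illustration: struct demo_bond_params and fill_actor_system are hypothetical names, and the 8-byte field size is taken from the patch description rather than from the bonding headers.

```c
#include <stddef.h>
#include <string.h>

#define ETH_ALEN 6 /* length of an Ethernet MAC address, in octets */

/* Hypothetical stand-in for the bonding parameters. Only the size of the
 * field matters here; the patch text states the real field is 8 bytes. */
struct demo_bond_params {
	unsigned char ad_actor_system[ETH_ALEN + 2];
};

/* Copies the attribute payload into buf and returns its length.
 * Using sizeof() on the field instead of ETH_ALEN would also emit
 * the two trailing bytes that are not part of the MAC address. */
static size_t fill_actor_system(const struct demo_bond_params *p,
				unsigned char *buf)
{
	memcpy(buf, p->ad_actor_system, ETH_ALEN);
	return ETH_ALEN;
}
```

With ETH_ALEN as the length, a reader of the attribute sees exactly the six address bytes.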
* [PATCH v2] sctp: socket.c validate sprstat_policy
From: Tomas Bortoli @ 2018-10-27 20:53 UTC (permalink / raw)
To: vyasevich, nhorman, marcelo.leitner
Cc: davem, linux-sctp, netdev, linux-kernel, Tomas Bortoli
It is possible to perform out-of-bound reads on
sctp_getsockopt_pr_streamstatus() and on
sctp_getsockopt_pr_assocstatus() by passing from userspace a
sprstat_policy that overflows the abandoned_sent/abandoned_unsent
fixed length arrays. The over-read data are directly copied/leaked
to userspace.
Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com>
Reported-by: syzbot+5da0d0a72a9e7d791748@syzkaller.appspotmail.com
---
net/sctp/socket.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index fc0386e8ff23..14dce5d95817 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -7083,7 +7083,9 @@ static int sctp_getsockopt_pr_assocstatus(struct sock *sk, int len,
}
policy = params.sprstat_policy;
- if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)))
+ if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)) ||
+ __SCTP_PR_INDEX(policy) > SCTP_PR_INDEX(MAX) ||
+ __SCTP_PR_INDEX(policy) < 0)
goto out;
asoc = sctp_id2assoc(sk, params.sprstat_assoc_id);
@@ -7142,7 +7144,9 @@ static int sctp_getsockopt_pr_streamstatus(struct sock *sk, int len,
}
policy = params.sprstat_policy;
- if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)))
+ if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)) ||
+ __SCTP_PR_INDEX(policy) > SCTP_PR_INDEX(MAX) ||
+ __SCTP_PR_INDEX(policy) < 0)
goto out;
asoc = sctp_id2assoc(sk, params.sprstat_assoc_id);
--
2.11.0
^ permalink raw reply related
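The general idea behind the check discussed in this thread, validating a derived array index on both sides before using it, can be sketched as below. All DEMO_* constants and the function name are assumptions made for illustration; they are not copied from the SCTP headers, and this is not the patch's exact condition.

```c
#include <stdbool.h>

/* Illustrative values only (assumptions, not the kernel's definitions). */
#define DEMO_PR_MAX	0x0030			/* highest single policy value */
#define DEMO_PR_INDEX(x)	(((x) >> 4) - 1)	/* policy value -> array slot */
#define DEMO_PR_SLOTS	(DEMO_PR_INDEX(DEMO_PR_MAX) + 1)

/* Reads counters[index(policy)] only when the index is inside the array.
 * Returns false, leaving *out untouched, for any out-of-range policy. */
static bool demo_read_counter(const unsigned long counters[DEMO_PR_SLOTS],
			      int policy, unsigned long *out)
{
	int idx = DEMO_PR_INDEX(policy);

	if (idx < 0 || idx >= DEMO_PR_SLOTS)
		return false;
	*out = counters[idx];
	return true;
}
```

Checking the index right where the array is accessed keeps the bound next to the thing it protects, whatever flag bits the caller may have set in the policy value.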
* Re: [PATCH] sctp: socket.c validate sprstat_policy
From: kbuild test robot @ 2018-10-27 20:50 UTC (permalink / raw)
To: Tomas Bortoli
Cc: kbuild-all, vyasevich, nhorman, marcelo.leitner, davem,
linux-sctp, netdev, linux-kernel, syzkaller, Tomas Bortoli
In-Reply-To: <20181027195853.30243-1-tomasbortoli@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 4456 bytes --]
Hi Tomas,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on net-next/master]
[also build test WARNING on v4.19 next-20181019]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Tomas-Bortoli/sctp-socket-c-validate-sprstat_policy/20181028-040051
config: i386-randconfig-x075-201843 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386
All warnings (new ones prefixed by >>):
In file included from arch/x86/include/asm/atomic.h:5:0,
from include/linux/atomic.h:7,
from include/linux/crypto.h:20,
from include/crypto/hash.h:16,
from net/sctp/socket.c:55:
net/sctp/socket.c: In function 'sctp_getsockopt_pr_assocstatus':
net/sctp/socket.c:7086:25: error: called object is not a function or function pointer
if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL))
~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/compiler.h:58:30: note: in definition of macro '__trace_if'
if (__builtin_constant_p(!!(cond)) ? !!(cond) : \
^~~~
>> net/sctp/socket.c:7086:2: note: in expansion of macro 'if'
if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL))
^~
net/sctp/socket.c:7086:25: error: called object is not a function or function pointer
if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL))
~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/compiler.h:58:42: note: in definition of macro '__trace_if'
if (__builtin_constant_p(!!(cond)) ? !!(cond) : \
^~~~
>> net/sctp/socket.c:7086:2: note: in expansion of macro 'if'
if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL))
^~
net/sctp/socket.c:7086:25: error: called object is not a function or function pointer
if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL))
~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/compiler.h:69:16: note: in definition of macro '__trace_if'
______r = !!(cond); \
^~~~
>> net/sctp/socket.c:7086:2: note: in expansion of macro 'if'
if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL))
^~
vim +/if +7086 net/sctp/socket.c
7066
7067 static int sctp_getsockopt_pr_assocstatus(struct sock *sk, int len,
7068 char __user *optval,
7069 int __user *optlen)
7070 {
7071 struct sctp_prstatus params;
7072 struct sctp_association *asoc;
7073 int policy;
7074 int retval = -EINVAL;
7075
7076 if (len < sizeof(params))
7077 goto out;
7078
7079 len = sizeof(params);
7080 if (copy_from_user(&params, optval, len)) {
7081 retval = -EFAULT;
7082 goto out;
7083 }
7084
7085 policy = params.sprstat_policy;
> 7086 if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL))
7087 __SCTP_PR_INDEX(policy) > SCTP_PR_INDEX(MAX))
7088 goto out;
7089
7090 asoc = sctp_id2assoc(sk, params.sprstat_assoc_id);
7091 if (!asoc)
7092 goto out;
7093
7094 if (policy & SCTP_PR_SCTP_ALL) {
7095 params.sprstat_abandoned_unsent = 0;
7096 params.sprstat_abandoned_sent = 0;
7097 for (policy = 0; policy <= SCTP_PR_INDEX(MAX); policy++) {
7098 params.sprstat_abandoned_unsent +=
7099 asoc->abandoned_unsent[policy];
7100 params.sprstat_abandoned_sent +=
7101 asoc->abandoned_sent[policy];
7102 }
7103 } else {
7104 params.sprstat_abandoned_unsent =
7105 asoc->abandoned_unsent[__SCTP_PR_INDEX(policy)];
7106 params.sprstat_abandoned_sent =
7107 asoc->abandoned_sent[__SCTP_PR_INDEX(policy)];
7108 }
7109
7110 if (put_user(len, optlen)) {
7111 retval = -EFAULT;
7112 goto out;
7113 }
7114
7115 if (copy_to_user(optval, &params, len)) {
7116 retval = -EFAULT;
7117 goto out;
7118 }
7119
7120 retval = 0;
7121
7122 out:
7123 return retval;
7124 }
7125
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 33915 bytes --]
^ permalink raw reply
* [PATCH] net/packet: support vhost mrg_rxbuf
From: Jianfeng Tan @ 2018-10-27 12:04 UTC (permalink / raw)
To: netdev; +Cc: davem, jasowang, mst
Previously, the virtio net header size was hardcoded to 10, which made
the mrg_rxbuf feature unavailable.
We redefine the PACKET_VNET_HDR option, which used to treat user input as
a boolean, to treat it as an int: 0, 10 and 12 are taken as given, and
any other value is treated as 10.
One case is handled differently from before: if the user input is 12,
the header size was previously 10, but is now 12.
Signed-off-by: Jianfeng Tan <jianfeng.tan@linux.alibaba.com>
---
net/packet/af_packet.c | 97 ++++++++++++++++++++++++++----------------
net/packet/diag.c | 2 +-
net/packet/internal.h | 2 +-
3 files changed, 63 insertions(+), 38 deletions(-)
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index ec3095f13aae..1bd7f4cdcc80 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1999,18 +1999,24 @@ static unsigned int run_filter(struct sk_buff *skb,
}
static int packet_rcv_vnet(struct msghdr *msg, const struct sk_buff *skb,
- size_t *len)
+ size_t *len, int vnet_hdr_len)
{
+ int res;
struct virtio_net_hdr vnet_hdr;
- if (*len < sizeof(vnet_hdr))
+ if (*len < vnet_hdr_len)
return -EINVAL;
- *len -= sizeof(vnet_hdr);
+ *len -= vnet_hdr_len;
if (virtio_net_hdr_from_skb(skb, &vnet_hdr, vio_le(), true, 0))
return -EINVAL;
- return memcpy_to_msg(msg, (void *)&vnet_hdr, sizeof(vnet_hdr));
+ res = memcpy_to_msg(msg, (void *)&vnet_hdr, sizeof(vnet_hdr));
+ if (res == 0)
+ iov_iter_advance(&msg->msg_iter,
+ vnet_hdr_len - sizeof(vnet_hdr));
+
+ return res;
}
/*
@@ -2206,11 +2212,13 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
po->tp_reserve;
} else {
unsigned int maclen = skb_network_offset(skb);
+ int vnet_hdr_sz = READ_ONCE(po->vnet_hdr_sz);
+
netoff = TPACKET_ALIGN(po->tp_hdrlen +
(maclen < 16 ? 16 : maclen)) +
po->tp_reserve;
- if (po->has_vnet_hdr) {
- netoff += sizeof(struct virtio_net_hdr);
+ if (vnet_hdr_sz) {
+ netoff += vnet_hdr_sz;
do_vnet = true;
}
macoff = netoff - maclen;
@@ -2429,19 +2437,6 @@ static int __packet_snd_vnet_parse(struct virtio_net_hdr *vnet_hdr, size_t len)
return 0;
}
-static int packet_snd_vnet_parse(struct msghdr *msg, size_t *len,
- struct virtio_net_hdr *vnet_hdr)
-{
- if (*len < sizeof(*vnet_hdr))
- return -EINVAL;
- *len -= sizeof(*vnet_hdr);
-
- if (!copy_from_iter_full(vnet_hdr, sizeof(*vnet_hdr), &msg->msg_iter))
- return -EFAULT;
-
- return __packet_snd_vnet_parse(vnet_hdr, *len);
-}
-
static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
void *frame, struct net_device *dev, void *data, int tp_len,
__be16 proto, unsigned char *addr, int hlen, int copylen,
@@ -2609,6 +2604,7 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
int len_sum = 0;
int status = TP_STATUS_AVAILABLE;
int hlen, tlen, copylen = 0;
+ int vnet_hdr_sz;
mutex_lock(&po->pg_vec_lock);
@@ -2648,7 +2644,8 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
size_max = po->tx_ring.frame_size
- (po->tp_hdrlen - sizeof(struct sockaddr_ll));
- if ((size_max > dev->mtu + reserve + VLAN_HLEN) && !po->has_vnet_hdr)
+ vnet_hdr_sz = READ_ONCE(po->vnet_hdr_sz);
+ if ((size_max > dev->mtu + reserve + VLAN_HLEN) && !vnet_hdr_sz)
size_max = dev->mtu + reserve + VLAN_HLEN;
do {
@@ -2668,10 +2665,10 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
status = TP_STATUS_SEND_REQUEST;
hlen = LL_RESERVED_SPACE(dev);
tlen = dev->needed_tailroom;
- if (po->has_vnet_hdr) {
+ if (vnet_hdr_sz) {
vnet_hdr = data;
- data += sizeof(*vnet_hdr);
- tp_len -= sizeof(*vnet_hdr);
+ data += vnet_hdr_sz;
+ tp_len -= vnet_hdr_sz;
if (tp_len < 0 ||
__packet_snd_vnet_parse(vnet_hdr, tp_len)) {
tp_len = -EINVAL;
@@ -2696,7 +2693,7 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
addr, hlen, copylen, &sockc);
if (likely(tp_len >= 0) &&
tp_len > dev->mtu + reserve &&
- !po->has_vnet_hdr &&
+ !vnet_hdr_sz &&
!packet_extra_vlan_len_allowed(dev, skb))
tp_len = -EMSGSIZE;
@@ -2715,7 +2712,7 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
}
}
- if (po->has_vnet_hdr) {
+ if (vnet_hdr_sz) {
if (virtio_net_hdr_to_skb(skb, vnet_hdr, vio_le())) {
tp_len = -EINVAL;
goto tpacket_error;
@@ -2802,9 +2799,9 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
int err, reserve = 0;
struct sockcm_cookie sockc;
struct virtio_net_hdr vnet_hdr = { 0 };
+ int vnet_hdr_sz;
int offset = 0;
struct packet_sock *po = pkt_sk(sk);
- bool has_vnet_hdr = false;
int hlen, tlen, linear;
int extra_len = 0;
@@ -2844,11 +2841,29 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
if (sock->type == SOCK_RAW)
reserve = dev->hard_header_len;
- if (po->has_vnet_hdr) {
- err = packet_snd_vnet_parse(msg, &len, &vnet_hdr);
- if (err)
+
+ vnet_hdr_sz = READ_ONCE(po->vnet_hdr_sz);
+ if (vnet_hdr_sz) {
+ if (len < vnet_hdr_sz) {
+ err = -EINVAL;
goto out_unlock;
- has_vnet_hdr = true;
+ }
+ len -= vnet_hdr_sz;
+
+ if (!copy_from_iter_full(&vnet_hdr, sizeof(vnet_hdr),
+ &msg->msg_iter)) {
+ err = -EFAULT;
+ goto out_unlock;
+ }
+
+ if (__packet_snd_vnet_parse(&vnet_hdr, len)) {
+ err = -EINVAL;
+ goto out_unlock;
+ }
+
+ /* TODO: check hdr_len with len? */
+
+ iov_iter_advance(&msg->msg_iter, vnet_hdr_sz - sizeof(vnet_hdr));
}
if (unlikely(sock_flag(sk, SOCK_NOFCS))) {
@@ -2912,7 +2927,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
skb->mark = sockc.mark;
skb->tstamp = sockc.transmit_time;
- if (has_vnet_hdr) {
+ if (vnet_hdr_sz) {
err = virtio_net_hdr_to_skb(skb, &vnet_hdr, vio_le());
if (err)
goto out_free;
@@ -3307,11 +3322,11 @@ static int packet_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
if (pkt_sk(sk)->pressure)
packet_rcv_has_room(pkt_sk(sk), NULL);
- if (pkt_sk(sk)->has_vnet_hdr) {
- err = packet_rcv_vnet(msg, skb, &len);
+ vnet_hdr_len = READ_ONCE(pkt_sk(sk)->vnet_hdr_sz);
+ if (vnet_hdr_len) {
+ err = packet_rcv_vnet(msg, skb, &len, vnet_hdr_len);
if (err)
goto out_free;
- vnet_hdr_len = sizeof(struct virtio_net_hdr);
}
/* You lose any data beyond the buffer you gave. If it worries
@@ -3772,7 +3787,17 @@ packet_setsockopt(struct socket *sock, int level, int optname, char __user *optv
if (po->rx_ring.pg_vec || po->tx_ring.pg_vec) {
ret = -EBUSY;
} else {
- po->has_vnet_hdr = !!val;
+ /* Previously we treated user input as boolean (!!val),
+ * now we treat it as int. After the below correction,
+ * the only violation case is 12, which results in
+ * vnet header size of 12 instead of 10.
+ */
+ if (val &&
+ val != sizeof(struct virtio_net_hdr) &&
+ val != sizeof(struct virtio_net_hdr_mrg_rxbuf))
+ val = sizeof(struct virtio_net_hdr);
+
+ po->vnet_hdr_sz = val;
ret = 0;
}
release_sock(sk);
@@ -3903,7 +3928,7 @@ static int packet_getsockopt(struct socket *sock, int level, int optname,
val = po->origdev;
break;
case PACKET_VNET_HDR:
- val = po->has_vnet_hdr;
+ val = po->vnet_hdr_sz;
break;
case PACKET_VERSION:
val = po->tp_version;
diff --git a/net/packet/diag.c b/net/packet/diag.c
index 7ef1c881ae74..950015b6704f 100644
--- a/net/packet/diag.c
+++ b/net/packet/diag.c
@@ -26,7 +26,7 @@ static int pdiag_put_info(const struct packet_sock *po, struct sk_buff *nlskb)
pinfo.pdi_flags |= PDI_AUXDATA;
if (po->origdev)
pinfo.pdi_flags |= PDI_ORIGDEV;
- if (po->has_vnet_hdr)
+ if (po->vnet_hdr_sz)
pinfo.pdi_flags |= PDI_VNETHDR;
if (po->tp_loss)
pinfo.pdi_flags |= PDI_LOSS;
diff --git a/net/packet/internal.h b/net/packet/internal.h
index 3bb7c5fb3bff..11bc75950f28 100644
--- a/net/packet/internal.h
+++ b/net/packet/internal.h
@@ -115,9 +115,9 @@ struct packet_sock {
unsigned int running; /* bind_lock must be held */
unsigned int auxdata:1, /* writer must hold sock lock */
origdev:1,
- has_vnet_hdr:1,
tp_loss:1,
tp_tx_has_off:1;
+ int vnet_hdr_sz;
int pressure;
int ifindex; /* bound device */
__be16 num;
--
2.17.1
^ permalink raw reply related
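The value mapping described in the patch above (0, 10 and 12 kept, everything else treated as 10) can be written as a small stand-alone function. This is a sketch under stated assumptions: the DEMO_* names and the function name are hypothetical, and the two sizes are hard-coded from the patch description instead of being taken with sizeof() from the virtio headers.

```c
/* Sizes as given in the patch description: the plain virtio-net header
 * is 10 bytes and the mergeable-rx-buffer variant is 12 bytes. */
enum {
	DEMO_VNET_HDR_SZ	= 10,
	DEMO_VNET_HDR_MRG_SZ	= 12,
};

/* Maps the user-supplied option value to the header size to store:
 * 0 disables the header, 10 and 12 are kept, and any other non-zero
 * value falls back to 10, matching the old boolean behaviour. */
static int demo_normalize_vnet_hdr_sz(int val)
{
	if (val == 0)
		return 0;
	if (val == DEMO_VNET_HDR_SZ || val == DEMO_VNET_HDR_MRG_SZ)
		return val;
	return DEMO_VNET_HDR_SZ;
}
```

Under this mapping the only input whose result differs from the old !!val handling is 12, which is the case the patch description calls out.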
* Re: [PATCH] sctp: socket.c validate sprstat_policy
From: Tomas Bortoli @ 2018-10-27 20:43 UTC (permalink / raw)
To: vyasevich, nhorman, marcelo.leitner
Cc: davem, linux-sctp, netdev, linux-kernel
In-Reply-To: <20181027202026.32157-1-tomasbortoli@gmail.com>
On 10/27/18 10:20 PM, Tomas Bortoli wrote:
> It is possible to perform out-of-bound reads on
> sctp_getsockopt_pr_streamstatus() and on
> sctp_getsockopt_pr_assocstatus() by passing from userspace a
> sprstat_policy that overflows the abandoned_sent/abandoned_unsent
> fixed length arrays. The over-read data are directly copied/leaked
> to userspace.
>
> Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com>
> Reported-by: syzbot+5da0d0a72a9e7d791748@syzkaller.appspotmail.com
> ---
> v2 - added the forgotten ||
>
> net/sctp/socket.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index fc0386e8ff23..5290b8bd40c8 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -7083,7 +7083,8 @@ static int sctp_getsockopt_pr_assocstatus(struct sock *sk, int len,
> }
>
> policy = params.sprstat_policy;
> - if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)))
> + if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)) ||
> + __SCTP_PR_INDEX(policy) > SCTP_PR_INDEX(MAX))
> goto out;
>
> asoc = sctp_id2assoc(sk, params.sprstat_assoc_id);
> @@ -7142,7 +7143,8 @@ static int sctp_getsockopt_pr_streamstatus(struct sock *sk, int len,
> }
>
> policy = params.sprstat_policy;
> - if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)))
> + if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)) ||
> + __SCTP_PR_INDEX(policy) > SCTP_PR_INDEX(MAX))
> goto out;
>
> asoc = sctp_id2assoc(sk, params.sprstat_assoc_id);
>
I just realized we also have to check for less than 0 indexes..
^ permalink raw reply
* [PATCH] sctp: socket.c validate sprstat_policy
From: Tomas Bortoli @ 2018-10-27 20:20 UTC (permalink / raw)
To: vyasevich, nhorman, marcelo.leitner
Cc: davem, linux-sctp, netdev, linux-kernel, Tomas Bortoli
It is possible to perform out-of-bound reads on
sctp_getsockopt_pr_streamstatus() and on
sctp_getsockopt_pr_assocstatus() by passing from userspace a
sprstat_policy that overflows the abandoned_sent/abandoned_unsent
fixed length arrays. The over-read data are directly copied/leaked
to userspace.
Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com>
Reported-by: syzbot+5da0d0a72a9e7d791748@syzkaller.appspotmail.com
---
v2 - added the forgotten ||
net/sctp/socket.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index fc0386e8ff23..5290b8bd40c8 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -7083,7 +7083,8 @@ static int sctp_getsockopt_pr_assocstatus(struct sock *sk, int len,
}
policy = params.sprstat_policy;
- if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)))
+ if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)) ||
+ __SCTP_PR_INDEX(policy) > SCTP_PR_INDEX(MAX))
goto out;
asoc = sctp_id2assoc(sk, params.sprstat_assoc_id);
@@ -7142,7 +7143,8 @@ static int sctp_getsockopt_pr_streamstatus(struct sock *sk, int len,
}
policy = params.sprstat_policy;
- if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)))
+ if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)) ||
+ __SCTP_PR_INDEX(policy) > SCTP_PR_INDEX(MAX))
goto out;
asoc = sctp_id2assoc(sk, params.sprstat_assoc_id);
--
2.11.0
^ permalink raw reply related
* [PATCH] sctp: socket.c validate sprstat_policy
From: Tomas Bortoli @ 2018-10-27 19:58 UTC (permalink / raw)
To: vyasevich, nhorman, marcelo.leitner
Cc: davem, linux-sctp, netdev, linux-kernel, syzkaller, Tomas Bortoli
It is possible to perform out-of-bound reads on
sctp_getsockopt_pr_streamstatus() and on
sctp_getsockopt_pr_assocstatus() by passing from userspace a
sprstat_policy that overflows the abandoned_sent/abandoned_unsent
fixed length arrays. The over-read data are directly copied/leaked
to userspace.
Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com>
Reported-by: syzbot+5da0d0a72a9e7d791748@syzkaller.appspotmail.com
---
net/sctp/socket.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index fc0386e8ff23..5290b8bd40c8 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -7083,7 +7083,8 @@ static int sctp_getsockopt_pr_assocstatus(struct sock *sk, int len,
}
policy = params.sprstat_policy;
- if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)))
+ if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL))
+ __SCTP_PR_INDEX(policy) > SCTP_PR_INDEX(MAX))
goto out;
asoc = sctp_id2assoc(sk, params.sprstat_assoc_id);
@@ -7142,7 +7143,8 @@ static int sctp_getsockopt_pr_streamstatus(struct sock *sk, int len,
}
policy = params.sprstat_policy;
- if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)))
+ if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)) ||
+ __SCTP_PR_INDEX(policy) > SCTP_PR_INDEX(MAX))
goto out;
asoc = sctp_id2assoc(sk, params.sprstat_assoc_id);
--
2.11.0
^ permalink raw reply related
* Re: CAKE and r8169 cause panic on upload in v4.19
From: Oleksandr Natalenko @ 2018-10-27 11:04 UTC (permalink / raw)
To: Dave Taht
Cc: hkallweit1, Toke Høiland-Jørgensen, David S. Miller,
Jamal Hadi Salim, Cong Wang, Jiří Pírko,
Linux Kernel Network Developers, linux-kernel
In-Reply-To: <CAA93jw4pvEYiXZ-C=yH1H_twZigSYAuEMrE_CCDjzLnVHwAdVA@mail.gmail.com>
Hi.
On 27.10.2018 01:08, Dave Taht wrote:
> Groovy. :whew:
>
> I do look forward to more cake test results, particularly on different
> network cards such as these, and at speeds higher than 10Gbit on high
> end hardware, and in the 100-1Gbit range on low to mid-range. After
> the last round of features added to cake before it went into linux, we
> run now out of cpu on inbound shaping at those speeds on low end apu2
> (x86) hardware, (atom and a15 chips are not so hot now either) and I
> wish I knew what we could do to speed it up. The new "list skb" and
> mirred code looked promising but we haven't got around to exploring it
> yet.
>
> Thank you for trying and I hope this gets sorted out on your chipset.
Yeah, but this is still strange. Both LAN computer and router run 4.19,
but only router panics. The LAN computer employs alx driver, router
employs r8169. Both had GRO enabled at the moment of panic. But [1]
reports that this happens with Intel NIC too, so must not be limited to
Realtek.
> We tend to use flent's rrul test to *really* abuse things. :)
>
> So cake's ok with gro disabled in hw?
Yes, I've gone back to CAKE but with GRO disabled for NIC, and it is
stable now. I've also asked a bug reporter [1] to do the same, so we
will see.
Thanks.
--
Oleksandr Natalenko (post-factum)
[1] https://bugzilla.kernel.org/show_bug.cgi?id=201063
^ permalink raw reply
* [PATCH] can: hi311x: Use level-triggered interrupt
From: Lukas Wunner @ 2018-10-27 8:36 UTC (permalink / raw)
To: Marc Kleine-Budde, Wolfgang Grandegger
Cc: Mathias Duckeck, Akshay Bhat, Casey Fitzpatrick, linux-can,
netdev
If the hi3110 shares the SPI bus with another traffic-intensive device
and packets are received in high volume (by a separate machine sending
with "cangen -g 0 -i -x"), reception stops after a few minutes and the
counter in /proc/interrupts stops incrementing. Bus state is "active".
Bringing the interface down and back up resumes reception. The
issue is not observed when the hi3110 is the sole device on the SPI bus.
Using a level-triggered interrupt makes the issue go away and lets the
hi3110 successfully receive 2 GByte over the course of 5 days while a
ks8851 Ethernet chip on the same SPI bus handles 6 GByte of traffic.
Unfortunately the hi3110 datasheet is mum on the trigger type. The pin
description on page 3 only specifies the polarity (active high):
http://www.holtic.com/documents/371-hi-3110_v-rev-kpdf.do
Cc: Mathias Duckeck <m.duckeck@kunbus.de>
Cc: Akshay Bhat <akshay.bhat@timesys.com>
Cc: Casey Fitzpatrick <casey.fitzpatrick@timesys.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
Documentation/devicetree/bindings/net/can/holt_hi311x.txt | 2 +-
drivers/net/can/spi/hi311x.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/Documentation/devicetree/bindings/net/can/holt_hi311x.txt b/Documentation/devicetree/bindings/net/can/holt_hi311x.txt
index 903a78da65be..3a9926f99937 100644
--- a/Documentation/devicetree/bindings/net/can/holt_hi311x.txt
+++ b/Documentation/devicetree/bindings/net/can/holt_hi311x.txt
@@ -17,7 +17,7 @@ Example:
reg = <1>;
clocks = <&clk32m>;
interrupt-parent = <&gpio4>;
- interrupts = <13 IRQ_TYPE_EDGE_RISING>;
+ interrupts = <13 IRQ_TYPE_LEVEL_HIGH>;
vdd-supply = <&reg5v0>;
xceiver-supply = <&reg5v0>;
};
diff --git a/drivers/net/can/spi/hi311x.c b/drivers/net/can/spi/hi311x.c
index 53e320c92a8b..ddaf46239e39 100644
--- a/drivers/net/can/spi/hi311x.c
+++ b/drivers/net/can/spi/hi311x.c
@@ -760,7 +760,7 @@ static int hi3110_open(struct net_device *net)
{
struct hi3110_priv *priv = netdev_priv(net);
struct spi_device *spi = priv->spi;
- unsigned long flags = IRQF_ONESHOT | IRQF_TRIGGER_RISING;
+ unsigned long flags = IRQF_ONESHOT | IRQF_TRIGGER_HIGH;
int ret;
ret = open_candev(net);
--
2.19.1
^ permalink raw reply related
* [Patch V2 net 04/11] net: hns3: bugfix for the initialization of command queue's spin lock
From: Huazhong Tan @ 2018-10-27 8:10 UTC (permalink / raw)
To: davem; +Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
linyunsheng
In-Reply-To: <1540627818-17635-1-git-send-email-tanhuazhong@huawei.com>
The spin lock of the command queue only needs to be initialized once
when the driver initializes the command queue. It is not necessary to
initialize the spin lock when resetting. At the same time, the
modification of the queue member should be performed after acquiring
the lock.
Fixes: 3efb960f056d ("net: hns3: Refactor the initialization of command queue")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
index ac13cb2..68026a5 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
@@ -304,6 +304,10 @@ int hclge_cmd_queue_init(struct hclge_dev *hdev)
{
int ret;
+ /* Setup the lock for command queue */
+ spin_lock_init(&hdev->hw.cmq.csq.lock);
+ spin_lock_init(&hdev->hw.cmq.crq.lock);
+
/* Setup the queue entries for use cmd queue */
hdev->hw.cmq.csq.desc_num = HCLGE_NIC_CMQ_DESC_NUM;
hdev->hw.cmq.crq.desc_num = HCLGE_NIC_CMQ_DESC_NUM;
@@ -337,18 +341,20 @@ int hclge_cmd_init(struct hclge_dev *hdev)
u32 version;
int ret;
+ spin_lock_bh(&hdev->hw.cmq.csq.lock);
+ spin_lock_bh(&hdev->hw.cmq.crq.lock);
+
hdev->hw.cmq.csq.next_to_clean = 0;
hdev->hw.cmq.csq.next_to_use = 0;
hdev->hw.cmq.crq.next_to_clean = 0;
hdev->hw.cmq.crq.next_to_use = 0;
- /* Setup the lock for command queue */
- spin_lock_init(&hdev->hw.cmq.csq.lock);
- spin_lock_init(&hdev->hw.cmq.crq.lock);
-
hclge_cmd_init_regs(&hdev->hw);
clear_bit(HCLGE_STATE_CMD_DISABLE, &hdev->state);
+ spin_unlock_bh(&hdev->hw.cmq.crq.lock);
+ spin_unlock_bh(&hdev->hw.cmq.csq.lock);
+
ret = hclge_cmd_query_firmware_version(&hdev->hw, &version);
if (ret) {
dev_err(&hdev->pdev->dev,
--
2.7.4
^ permalink raw reply related
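The pattern described in the patch above (initialize the lock once at setup, and only take it when resetting the queue state) might look like this in a userspace analogue. It is a rough sketch assuming POSIX mutexes in place of kernel spinlocks; struct cmd_ring, cmd_ring_setup() and cmd_ring_reset() are hypothetical names, not part of the hns3 driver.

```c
#include <pthread.h>

/* Hypothetical command ring: a lock plus the two cursors it protects. */
struct cmd_ring {
	pthread_mutex_t lock;
	int next_to_use;
	int next_to_clean;
};

/* Called exactly once, when the ring is first created. */
static int cmd_ring_setup(struct cmd_ring *r)
{
	r->next_to_use = 0;
	r->next_to_clean = 0;
	return pthread_mutex_init(&r->lock, NULL);
}

/*
 * May be called on every reset. It never re-initializes the lock, since
 * doing so while another thread holds it would corrupt the lock state;
 * it only rewinds the cursors while holding the lock.
 */
static void cmd_ring_reset(struct cmd_ring *r)
{
	pthread_mutex_lock(&r->lock);
	r->next_to_use = 0;
	r->next_to_clean = 0;
	pthread_mutex_unlock(&r->lock);
}
```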
* [Patch V2 net 02/11] net: hns3: add error handler for hns3_get_ring_config/hns3_queue_to_ring
From: Huazhong Tan @ 2018-10-27 8:10 UTC (permalink / raw)
To: davem; +Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
linyunsheng
In-Reply-To: <1540627818-17635-1-git-send-email-tanhuazhong@huawei.com>
When hns3_get_ring_config() or hns3_queue_to_ring() fails during a reset,
the memory that has already been allocated is not freed before the
function returns. This patch frees that memory on the error paths.
Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index d9066c5..6f0fd62 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -3037,8 +3037,10 @@ static int hns3_queue_to_ring(struct hnae3_queue *tqp,
return ret;
ret = hns3_ring_get_cfg(tqp, priv, HNAE3_RING_TYPE_RX);
- if (ret)
+ if (ret) {
+ devm_kfree(priv->dev, priv->ring_data[tqp->tqp_index].ring);
return ret;
+ }
return 0;
}
@@ -3047,7 +3049,7 @@ static int hns3_get_ring_config(struct hns3_nic_priv *priv)
{
struct hnae3_handle *h = priv->ae_handle;
struct pci_dev *pdev = h->pdev;
- int i, ret;
+ int i, j, ret;
priv->ring_data = devm_kzalloc(&pdev->dev,
array3_size(h->kinfo.num_tqps,
@@ -3065,6 +3067,12 @@ static int hns3_get_ring_config(struct hns3_nic_priv *priv)
return 0;
err:
+ for (j = i - 1; j >= 0; j--) {
+ devm_kfree(priv->dev, priv->ring_data[j].ring);
+ devm_kfree(priv->dev,
+ priv->ring_data[j + h->kinfo.num_tqps].ring);
+ }
+
devm_kfree(&pdev->dev, priv->ring_data);
return ret;
}
--
2.7.4
^ permalink raw reply related
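The unwinding loop in the patch above follows a common shape: when step i fails, release whatever steps 0..i-1 acquired. A minimal userspace sketch, assuming plain malloc()/free() instead of the devm_* helpers; struct ring_pair, ring_pairs_alloc() and the fail_at parameter (which only simulates an allocation failure) are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical TX/RX pair standing in for the per-queue ring data. */
struct ring_pair {
	int *tx;
	int *rx;
};

/*
 * Allocates n pairs. fail_at simulates an RX allocation failure at that
 * index (pass -1 for no simulated failure). On error, everything that was
 * allocated so far is freed and -1 is returned.
 */
static int ring_pairs_alloc(struct ring_pair *pairs, int n, int fail_at)
{
	int i, j;

	for (i = 0; i < n; i++) {
		pairs[i].rx = NULL;
		pairs[i].tx = malloc(sizeof(*pairs[i].tx));
		if (!pairs[i].tx)
			goto err;

		pairs[i].rx = (i == fail_at) ? NULL
					     : malloc(sizeof(*pairs[i].rx));
		if (!pairs[i].rx) {
			/* Undo the half-built pair before unwinding the rest. */
			free(pairs[i].tx);
			pairs[i].tx = NULL;
			goto err;
		}
	}
	return 0;

err:
	/* Walk back over the fully built pairs, newest first. */
	for (j = i - 1; j >= 0; j--) {
		free(pairs[j].tx);
		free(pairs[j].rx);
		pairs[j].tx = NULL;
		pairs[j].rx = NULL;
	}
	return -1;
}
```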
* [Patch V2 net 11/11] net: hns3: bugfix for rtnl_lock's range in the hclgevf_reset()
From: Huazhong Tan @ 2018-10-27 8:10 UTC (permalink / raw)
To: davem; +Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
linyunsheng
In-Reply-To: <1540627818-17635-1-git-send-email-tanhuazhong@huawei.com>
Since hclgevf_reset_wait() is used to wait for the hardware to complete
the reset, it is not necessary to hold the rtnl_lock during
hclgevf_reset_wait(). So this patch releases the lock for the duration
of hclgevf_reset_wait().
Fixes: 6988eb2a9b77 ("net: hns3: Add support to reset the enet/ring mgmt layer")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
index b224f6a..085edb9 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -1170,6 +1170,8 @@ static int hclgevf_reset(struct hclgevf_dev *hdev)
/* bring down the nic to stop any ongoing TX/RX */
hclgevf_notify_client(hdev, HNAE3_DOWN_CLIENT);
+ rtnl_unlock();
+
/* check if VF could successfully fetch the hardware reset completion
* status from the hardware
*/
@@ -1181,12 +1183,15 @@ static int hclgevf_reset(struct hclgevf_dev *hdev)
ret);
dev_warn(&hdev->pdev->dev, "VF reset failed, disabling VF!\n");
+ rtnl_lock();
hclgevf_notify_client(hdev, HNAE3_UNINIT_CLIENT);
rtnl_unlock();
return ret;
}
+ rtnl_lock();
+
/* now, re-initialize the nic client and ae device*/
ret = hclgevf_reset_stack(hdev);
if (ret)
--
2.7.4
^ permalink raw reply related
* [Patch V2 net 09/11] net: hns3: bugfix for handling mailbox while the command queue reinitialized
From: Huazhong Tan @ 2018-10-27 8:10 UTC (permalink / raw)
To: davem; +Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
linyunsheng
In-Reply-To: <1540627818-17635-1-git-send-email-tanhuazhong@huawei.com>
On a multi-core machine, the mailbox service and the reset service can
run at the same time. The reset service re-initializes the command
queue; until that has happened, the mailbox handler can only get
invalid messages.
The HCLGE_STATE_CMD_DISABLE flag means that the command queue is not
available and needs to be reinitialized. Therefore, when the mailbox
handler recognizes this flag, it should not process the command.
Fixes: dde1a86e93ca ("net: hns3: Add mailbox support to PF driver")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c
index 04462a3..6ac2fab 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c
@@ -400,6 +400,12 @@ void hclge_mbx_handler(struct hclge_dev *hdev)
/* handle all the mailbox requests in the queue */
while (!hclge_cmd_crq_empty(&hdev->hw)) {
+ if (test_bit(HCLGE_STATE_CMD_DISABLE, &hdev->state)) {
+ dev_warn(&hdev->pdev->dev,
+ "command queue need re-initialize\n");
+ return;
+ }
+
desc = &crq->desc[crq->next_to_use];
req = (struct hclge_mbx_vf_to_pf_cmd *)desc->data;
--
2.7.4
^ permalink raw reply related
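The guard added in the patch above re-checks a "queue disabled" flag on every loop iteration, so a reset that starts mid-drain stops the handler promptly. A small userspace sketch of that shape, assuming C11 atomics in place of the kernel's test_bit(); struct mbx and mbx_drain() are hypothetical names used only for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical mailbox: a disabled flag set by the reset path, plus counters. */
struct mbx {
	atomic_bool cmd_disabled;
	int pending;
	int handled;
};

/*
 * Handles pending requests until the queue is empty or the reset path has
 * marked the command queue as disabled. Returns the number handled.
 */
static int mbx_drain(struct mbx *m)
{
	int n = 0;

	while (m->pending > 0) {
		/* Checked per iteration, not once up front. */
		if (atomic_load(&m->cmd_disabled))
			break;

		m->pending--;
		m->handled++;
		n++;
	}
	return n;
}
```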