* Re: [syzbot] KASAN: use-after-free Read in put_pmu_ctx
From: Peter Zijlstra @ 2022-12-20 8:22 UTC
To: sdf
Cc: syzbot, acme, alexander.shishkin, bpf, jolsa, linux-kernel,
linux-perf-users, mark.rutland, mingo, namhyung, netdev,
syzkaller-bugs
In-Reply-To: <Y6C8iQGENUk/XY/A@google.com>
On Mon, Dec 19, 2022 at 11:33:29AM -0800, sdf@google.com wrote:
> On 12/19, Peter Zijlstra wrote:
> > On Mon, Dec 19, 2022 at 12:04:43AM -0800, syzbot wrote:
> > > HEAD commit: 13e3c7793e2f Merge tag 'for-netdev' of https://git.kernel...
> > > git tree: bpf
> > > console+strace: https://syzkaller.appspot.com/x/log.txt?x=177df7e0480000
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=b0e91ad4b5f69c47
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=b8e8c01c8ade4fe6e48f
^ so syzbot knows what tree and config were used to trigger the report,
then why:
> Let's maybe try it this way:
>
> #syz test: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git
> 13e3c7793e2f
do you have to repeat that again in order for it to test something?
* [syzbot] KASAN: use-after-free Read in ovs_vport_locate
From: syzbot @ 2022-12-20 8:22 UTC
To: davem, dev, edumazet, kuba, linux-kernel, netdev, pabeni, pshelar,
syzkaller-bugs
Hello,
syzbot found the following issue on:
HEAD commit: 041fae9c105a Merge tag 'f2fs-for-6.2-rc1' of git://git.ker..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15c5d020480000
kernel config: https://syzkaller.appspot.com/x/.config?x=836aafbf33f4fa6c
dashboard link: https://syzkaller.appspot.com/bug?extid=8f4e2dcfcb3209ac35f9
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
Unfortunately, I don't have any reproducer for this issue yet.
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/30e749b24df4/disk-041fae9c.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/dd6d972f5b02/vmlinux-041fae9c.xz
kernel image: https://storage.googleapis.com/syzbot-assets/405163d7c7cc/bzImage-041fae9c.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+8f4e2dcfcb3209ac35f9@syzkaller.appspotmail.com
netlink: 208 bytes leftover after parsing attributes in process `syz-executor.4'.
==================================================================
BUG: KASAN: use-after-free in read_pnet include/net/net_namespace.h:383 [inline]
BUG: KASAN: use-after-free in ovs_dp_get_net net/openvswitch/datapath.h:195 [inline]
BUG: KASAN: use-after-free in ovs_vport_locate+0x131/0x150 net/openvswitch/vport.c:103
Read of size 8 at addr ffff88802055e360 by task syz-executor.4/5621
CPU: 0 PID: 5621 Comm: syz-executor.4 Not tainted 6.1.0-syzkaller-10971-g041fae9c105a #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd1/0x138 lib/dump_stack.c:106
print_address_description mm/kasan/report.c:306 [inline]
print_report+0x15e/0x461 mm/kasan/report.c:417
kasan_report+0xbf/0x1f0 mm/kasan/report.c:517
read_pnet include/net/net_namespace.h:383 [inline]
ovs_dp_get_net net/openvswitch/datapath.h:195 [inline]
ovs_vport_locate+0x131/0x150 net/openvswitch/vport.c:103
lookup_datapath+0x54/0x3a0 net/openvswitch/datapath.c:1628
ovs_dp_reset_user_features net/openvswitch/datapath.c:1639 [inline]
ovs_dp_cmd_new+0xd5b/0x11c0 net/openvswitch/datapath.c:1848
genl_family_rcv_msg_doit.isra.0+0x1e6/0x2d0 net/netlink/genetlink.c:968
genl_family_rcv_msg net/netlink/genetlink.c:1048 [inline]
genl_rcv_msg+0x4ff/0x7e0 net/netlink/genetlink.c:1065
netlink_rcv_skb+0x165/0x440 net/netlink/af_netlink.c:2564
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1076
netlink_unicast_kernel net/netlink/af_netlink.c:1330 [inline]
netlink_unicast+0x547/0x7f0 net/netlink/af_netlink.c:1356
netlink_sendmsg+0x91b/0xe10 net/netlink/af_netlink.c:1932
sock_sendmsg_nosec net/socket.c:714 [inline]
sock_sendmsg+0xd3/0x120 net/socket.c:734
____sys_sendmsg+0x712/0x8c0 net/socket.c:2476
___sys_sendmsg+0x110/0x1b0 net/socket.c:2530
__sys_sendmsg+0xf7/0x1c0 net/socket.c:2559
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f142348c0d9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f14240ff168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f14235abf80 RCX: 00007f142348c0d9
RDX: 0000000000000800 RSI: 0000000020000100 RDI: 0000000000000003
RBP: 00007f14234e7ae9 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffdd965a34f R14: 00007f14240ff300 R15: 0000000000022000
</TASK>
Allocated by task 5564:
kasan_save_stack+0x22/0x40 mm/kasan/common.c:45
kasan_set_track+0x25/0x30 mm/kasan/common.c:52
____kasan_kmalloc mm/kasan/common.c:371 [inline]
____kasan_kmalloc mm/kasan/common.c:330 [inline]
__kasan_kmalloc+0xa3/0xb0 mm/kasan/common.c:380
kmalloc include/linux/slab.h:580 [inline]
kzalloc include/linux/slab.h:720 [inline]
ovs_dp_cmd_new+0x1a3/0x11c0 net/openvswitch/datapath.c:1796
genl_family_rcv_msg_doit.isra.0+0x1e6/0x2d0 net/netlink/genetlink.c:968
genl_family_rcv_msg net/netlink/genetlink.c:1048 [inline]
genl_rcv_msg+0x4ff/0x7e0 net/netlink/genetlink.c:1065
netlink_rcv_skb+0x165/0x440 net/netlink/af_netlink.c:2564
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1076
netlink_unicast_kernel net/netlink/af_netlink.c:1330 [inline]
netlink_unicast+0x547/0x7f0 net/netlink/af_netlink.c:1356
netlink_sendmsg+0x91b/0xe10 net/netlink/af_netlink.c:1932
sock_sendmsg_nosec net/socket.c:714 [inline]
sock_sendmsg+0xd3/0x120 net/socket.c:734
____sys_sendmsg+0x712/0x8c0 net/socket.c:2476
___sys_sendmsg+0x110/0x1b0 net/socket.c:2530
__sys_sendmsg+0xf7/0x1c0 net/socket.c:2559
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Freed by task 5564:
kasan_save_stack+0x22/0x40 mm/kasan/common.c:45
kasan_set_track+0x25/0x30 mm/kasan/common.c:52
kasan_save_free_info+0x2b/0x40 mm/kasan/generic.c:518
____kasan_slab_free mm/kasan/common.c:236 [inline]
____kasan_slab_free+0x13b/0x1a0 mm/kasan/common.c:200
kasan_slab_free include/linux/kasan.h:177 [inline]
__cache_free mm/slab.c:3394 [inline]
__do_kmem_cache_free mm/slab.c:3580 [inline]
__kmem_cache_free+0xcd/0x3b0 mm/slab.c:3587
ovs_dp_cmd_new+0x25e/0x11c0 net/openvswitch/datapath.c:1884
genl_family_rcv_msg_doit.isra.0+0x1e6/0x2d0 net/netlink/genetlink.c:968
genl_family_rcv_msg net/netlink/genetlink.c:1048 [inline]
genl_rcv_msg+0x4ff/0x7e0 net/netlink/genetlink.c:1065
netlink_rcv_skb+0x165/0x440 net/netlink/af_netlink.c:2564
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1076
netlink_unicast_kernel net/netlink/af_netlink.c:1330 [inline]
netlink_unicast+0x547/0x7f0 net/netlink/af_netlink.c:1356
netlink_sendmsg+0x91b/0xe10 net/netlink/af_netlink.c:1932
sock_sendmsg_nosec net/socket.c:714 [inline]
sock_sendmsg+0xd3/0x120 net/socket.c:734
____sys_sendmsg+0x712/0x8c0 net/socket.c:2476
___sys_sendmsg+0x110/0x1b0 net/socket.c:2530
__sys_sendmsg+0xf7/0x1c0 net/socket.c:2559
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Last potentially related work creation:
kasan_save_stack+0x22/0x40 mm/kasan/common.c:45
__kasan_record_aux_stack+0x7b/0x90 mm/kasan/generic.c:488
insert_work+0x48/0x350 kernel/workqueue.c:1358
__queue_work+0x693/0x13b0 kernel/workqueue.c:1517
queue_work_on+0xf2/0x110 kernel/workqueue.c:1545
queue_work include/linux/workqueue.h:503 [inline]
addr_event.part.0+0x33e/0x4f0 drivers/infiniband/core/roce_gid_mgmt.c:853
addr_event drivers/infiniband/core/roce_gid_mgmt.c:824 [inline]
inet6addr_event+0x142/0x1c0 drivers/infiniband/core/roce_gid_mgmt.c:883
notifier_call_chain+0xb5/0x200 kernel/notifier.c:87
atomic_notifier_call_chain+0x74/0x180 kernel/notifier.c:225
ipv6_add_addr+0x1266/0x1de0 net/ipv6/addrconf.c:1165
addrconf_add_linklocal+0x1cc/0x590 net/ipv6/addrconf.c:3215
addrconf_addr_gen+0x326/0x370 net/ipv6/addrconf.c:3346
addrconf_dev_config+0x255/0x410 net/ipv6/addrconf.c:3391
addrconf_notify+0xfb6/0x1c80 net/ipv6/addrconf.c:3635
notifier_call_chain+0xb5/0x200 kernel/notifier.c:87
call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:1944
netdev_state_change net/core/dev.c:1319 [inline]
netdev_state_change+0x104/0x130 net/core/dev.c:1312
linkwatch_do_dev+0x10e/0x150 net/core/link_watch.c:182
__linkwatch_run_queue+0x23f/0x6a0 net/core/link_watch.c:235
linkwatch_event+0x4e/0x70 net/core/link_watch.c:278
process_one_work+0x9bf/0x1710 kernel/workqueue.c:2289
worker_thread+0x669/0x1090 kernel/workqueue.c:2436
kthread+0x2e8/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
Second to last potentially related work creation:
kasan_save_stack+0x22/0x40 mm/kasan/common.c:45
__kasan_record_aux_stack+0x7b/0x90 mm/kasan/generic.c:488
insert_work+0x48/0x350 kernel/workqueue.c:1358
__queue_work+0x693/0x13b0 kernel/workqueue.c:1517
queue_work_on+0xf2/0x110 kernel/workqueue.c:1545
queue_work include/linux/workqueue.h:503 [inline]
netdevice_queue_work drivers/infiniband/core/roce_gid_mgmt.c:659 [inline]
netdevice_event+0x5e9/0x8f0 drivers/infiniband/core/roce_gid_mgmt.c:802
notifier_call_chain+0xb5/0x200 kernel/notifier.c:87
call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:1944
call_netdevice_notifiers_extack net/core/dev.c:1982 [inline]
call_netdevice_notifiers net/core/dev.c:1996 [inline]
register_netdevice+0xfb4/0x1640 net/core/dev.c:10078
bond_newlink drivers/net/bonding/bond_netlink.c:560 [inline]
bond_newlink+0x4b/0xa0 drivers/net/bonding/bond_netlink.c:550
rtnl_newlink_create net/core/rtnetlink.c:3407 [inline]
__rtnl_newlink+0x10c2/0x1840 net/core/rtnetlink.c:3624
rtnl_newlink+0x68/0xa0 net/core/rtnetlink.c:3637
rtnetlink_rcv_msg+0x43e/0xca0 net/core/rtnetlink.c:6141
netlink_rcv_skb+0x165/0x440 net/netlink/af_netlink.c:2564
netlink_unicast_kernel net/netlink/af_netlink.c:1330 [inline]
netlink_unicast+0x547/0x7f0 net/netlink/af_netlink.c:1356
netlink_sendmsg+0x91b/0xe10 net/netlink/af_netlink.c:1932
sock_sendmsg_nosec net/socket.c:714 [inline]
sock_sendmsg+0xd3/0x120 net/socket.c:734
__sys_sendto+0x23a/0x340 net/socket.c:2117
__do_sys_sendto net/socket.c:2129 [inline]
__se_sys_sendto net/socket.c:2125 [inline]
__x64_sys_sendto+0xe1/0x1b0 net/socket.c:2125
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
The buggy address belongs to the object at ffff88802055e300
which belongs to the cache kmalloc-192 of size 192
The buggy address is located 96 bytes inside of
192-byte region [ffff88802055e300, ffff88802055e3c0)
The buggy address belongs to the physical page:
page:ffffea0000815780 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88802055ef00 pfn:0x2055e
flags: 0xfff00000000200(slab|node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000000200 ffff888012040000 ffffea0000873fd0 ffffea000083f490
raw: ffff88802055ef00 ffff88802055e000 000000010000000e 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x2420c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_COMP|__GFP_THISNODE), pid 1, tgid 1 (swapper/0), ts 8303233753, free_ts 8269599359
prep_new_page mm/page_alloc.c:2531 [inline]
get_page_from_freelist+0x119c/0x2ce0 mm/page_alloc.c:4283
__alloc_pages+0x1cb/0x5b0 mm/page_alloc.c:5549
__alloc_pages_node include/linux/gfp.h:237 [inline]
kmem_getpages mm/slab.c:1363 [inline]
cache_grow_begin+0x94/0x390 mm/slab.c:2574
cache_alloc_refill+0x27f/0x380 mm/slab.c:2947
____cache_alloc mm/slab.c:3023 [inline]
____cache_alloc mm/slab.c:3006 [inline]
__do_cache_alloc mm/slab.c:3206 [inline]
slab_alloc_node mm/slab.c:3254 [inline]
__kmem_cache_alloc_node+0x44f/0x510 mm/slab.c:3544
kmalloc_trace+0x26/0x60 mm/slab_common.c:1062
kmalloc include/linux/slab.h:580 [inline]
kzalloc include/linux/slab.h:720 [inline]
call_usermodehelper_setup+0x9c/0x340 kernel/umh.c:366
kobject_uevent_env+0xed3/0x1620 lib/kobject_uevent.c:614
device_add+0xb76/0x1e90 drivers/base/core.c:3498
rfkill_register+0x1a9/0xb00 net/rfkill/core.c:1070
wiphy_register+0x24ae/0x2ae0 net/wireless/core.c:1007
virt_wifi_make_wiphy drivers/net/wireless/virt_wifi.c:383 [inline]
virt_wifi_init_module+0x352/0x3da drivers/net/wireless/virt_wifi.c:665
do_one_initcall+0x141/0x790 init/main.c:1306
do_initcall_level init/main.c:1379 [inline]
do_initcalls init/main.c:1395 [inline]
do_basic_setup init/main.c:1414 [inline]
kernel_init_freeable+0x6f9/0x782 init/main.c:1634
kernel_init+0x1e/0x1d0 init/main.c:1522
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
page last free stack trace:
reset_page_owner include/linux/page_owner.h:24 [inline]
free_pages_prepare mm/page_alloc.c:1446 [inline]
free_pcp_prepare+0x65c/0xc00 mm/page_alloc.c:1496
free_unref_page_prepare mm/page_alloc.c:3369 [inline]
free_unref_page+0x1d/0x490 mm/page_alloc.c:3464
__vunmap+0x85d/0xd30 mm/vmalloc.c:2727
free_work+0x5c/0x80 mm/vmalloc.c:100
process_one_work+0x9bf/0x1710 kernel/workqueue.c:2289
worker_thread+0x669/0x1090 kernel/workqueue.c:2436
kthread+0x2e8/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
Memory state around the buggy address:
ffff88802055e200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff88802055e280: 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88802055e300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88802055e380: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
ffff88802055e400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
* [PATCH] ath10k: snoc: enable threaded napi on WCN3990
From: Abhishek Kumar @ 2022-12-20 7:55 UTC
To: kvalo
Cc: ath10k, linux-wireless, kuabhs, linux-kernel, netdev,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
NAPI poll can be done in threaded context as well as in softirq
context. Threaded context can be scheduled efficiently, thus
creating less of a bottleneck during Rx processing. This patch
enables threaded NAPI in the ath10k driver.
Based on testing, it was observed that on WCN3990, CPU0 reaches
100% utilization when NAPI runs in softirq context, while the
other CPUs stay at a low utilization percentage. This prevents
the device from reaching its maximum throughput potential.
After enabling threaded NAPI, CPU load is balanced across all CPUs
and the following improvements were observed:
- UDP_RX increase by ~22-25%
- TCP_RX increase by ~15%
Tested-on: WCN3990 hw1.0 SNOC WLAN.HL.3.2.2-00696-QCAHLSWMTPL-1
Signed-off-by: Abhishek Kumar <kuabhs@chromium.org>
---
drivers/net/wireless/ath/ath10k/core.c | 16 ++++++++++++++++
drivers/net/wireless/ath/ath10k/hw.h | 2 ++
drivers/net/wireless/ath/ath10k/snoc.c | 3 +++
3 files changed, 21 insertions(+)
diff --git a/drivers/net/wireless/ath/ath10k/core.c b/drivers/net/wireless/ath/ath10k/core.c
index 5eb131ab916fd..ee4b6ba508c81 100644
--- a/drivers/net/wireless/ath/ath10k/core.c
+++ b/drivers/net/wireless/ath/ath10k/core.c
@@ -100,6 +100,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
.hw_restart_disconnect = false,
.use_fw_tx_credits = true,
.delay_unmap_buffer = false,
+ .enable_threaded_napi = false,
},
{
.id = QCA988X_HW_2_0_VERSION,
@@ -140,6 +141,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
.hw_restart_disconnect = false,
.use_fw_tx_credits = true,
.delay_unmap_buffer = false,
+ .enable_threaded_napi = false,
},
{
.id = QCA9887_HW_1_0_VERSION,
@@ -181,6 +183,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
.hw_restart_disconnect = false,
.use_fw_tx_credits = true,
.delay_unmap_buffer = false,
+ .enable_threaded_napi = false,
},
{
.id = QCA6174_HW_3_2_VERSION,
@@ -217,6 +220,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
.hw_restart_disconnect = false,
.use_fw_tx_credits = true,
.delay_unmap_buffer = false,
+ .enable_threaded_napi = false,
},
{
.id = QCA6174_HW_2_1_VERSION,
@@ -257,6 +261,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
.hw_restart_disconnect = false,
.use_fw_tx_credits = true,
.delay_unmap_buffer = false,
+ .enable_threaded_napi = false,
},
{
.id = QCA6174_HW_2_1_VERSION,
@@ -297,6 +302,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
.hw_restart_disconnect = false,
.use_fw_tx_credits = true,
.delay_unmap_buffer = false,
+ .enable_threaded_napi = false,
},
{
.id = QCA6174_HW_3_0_VERSION,
@@ -337,6 +343,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
.hw_restart_disconnect = false,
.use_fw_tx_credits = true,
.delay_unmap_buffer = false,
+ .enable_threaded_napi = false,
},
{
.id = QCA6174_HW_3_2_VERSION,
@@ -381,6 +388,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
.hw_restart_disconnect = false,
.use_fw_tx_credits = true,
.delay_unmap_buffer = false,
+ .enable_threaded_napi = false,
},
{
.id = QCA99X0_HW_2_0_DEV_VERSION,
@@ -427,6 +435,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
.hw_restart_disconnect = false,
.use_fw_tx_credits = true,
.delay_unmap_buffer = false,
+ .enable_threaded_napi = false,
},
{
.id = QCA9984_HW_1_0_DEV_VERSION,
@@ -480,6 +489,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
.hw_restart_disconnect = false,
.use_fw_tx_credits = true,
.delay_unmap_buffer = false,
+ .enable_threaded_napi = false,
},
{
.id = QCA9888_HW_2_0_DEV_VERSION,
@@ -530,6 +540,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
.hw_restart_disconnect = false,
.use_fw_tx_credits = true,
.delay_unmap_buffer = false,
+ .enable_threaded_napi = false,
},
{
.id = QCA9377_HW_1_0_DEV_VERSION,
@@ -570,6 +581,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
.hw_restart_disconnect = false,
.use_fw_tx_credits = true,
.delay_unmap_buffer = false,
+ .enable_threaded_napi = false,
},
{
.id = QCA9377_HW_1_1_DEV_VERSION,
@@ -612,6 +624,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
.hw_restart_disconnect = false,
.use_fw_tx_credits = true,
.delay_unmap_buffer = false,
+ .enable_threaded_napi = false,
},
{
.id = QCA9377_HW_1_1_DEV_VERSION,
@@ -645,6 +658,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
.hw_restart_disconnect = false,
.use_fw_tx_credits = true,
.delay_unmap_buffer = false,
+ .enable_threaded_napi = false,
},
{
.id = QCA4019_HW_1_0_DEV_VERSION,
@@ -692,6 +706,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
.hw_restart_disconnect = false,
.use_fw_tx_credits = true,
.delay_unmap_buffer = false,
+ .enable_threaded_napi = false,
},
{
.id = WCN3990_HW_1_0_DEV_VERSION,
@@ -725,6 +740,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
.hw_restart_disconnect = true,
.use_fw_tx_credits = false,
.delay_unmap_buffer = true,
+ .enable_threaded_napi = true,
},
};
diff --git a/drivers/net/wireless/ath/ath10k/hw.h b/drivers/net/wireless/ath/ath10k/hw.h
index 9643031a4427a..adf3076b96503 100644
--- a/drivers/net/wireless/ath/ath10k/hw.h
+++ b/drivers/net/wireless/ath/ath10k/hw.h
@@ -639,6 +639,8 @@ struct ath10k_hw_params {
bool use_fw_tx_credits;
bool delay_unmap_buffer;
+
+ bool enable_threaded_napi;
};
struct htt_resp;
diff --git a/drivers/net/wireless/ath/ath10k/snoc.c b/drivers/net/wireless/ath/ath10k/snoc.c
index cfcb759a87dea..b94150fb6ef06 100644
--- a/drivers/net/wireless/ath/ath10k/snoc.c
+++ b/drivers/net/wireless/ath/ath10k/snoc.c
@@ -927,6 +927,9 @@ static int ath10k_snoc_hif_start(struct ath10k *ar)
bitmap_clear(ar_snoc->pending_ce_irqs, 0, CE_COUNT_MAX);
+ if (ar->hw_params.enable_threaded_napi)
+ dev_set_threaded(&ar->napi_dev, true);
+
ath10k_core_napi_enable(ar);
ath10k_snoc_irq_enable(ar);
ath10k_snoc_rx_post(ar);
--
2.39.0.314.g84b9a713c41-goog
* Re: [PATCH v2] iavf/iavf_main: actually log ->src mask when talking about it
From: Michal Swiatkowski @ 2022-12-20 7:50 UTC
To: Daniil Tatianin
Cc: Jesse Brandeburg, Tony Nguyen, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Harshitha Ramamurthy, Jeff Kirsher, intel-wired-lan,
netdev, linux-kernel
In-Reply-To: <20221220063246.1593327-1-d-tatianin@yandex-team.ru>
On Tue, Dec 20, 2022 at 09:32:46AM +0300, Daniil Tatianin wrote:
> This fixes a copy-paste issue where dev_err would log the dst mask even
> though it is clearly talking about src.
>
> Found by Linux Verification Center (linuxtesting.org) with the SVACE
> static analysis tool.
>
> Fixes: 0075fa0fadd0 ("i40evf: Add support to apply cloud filters")
> Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
> ---
> drivers/net/ethernet/intel/iavf/iavf_main.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
> index c4e451ef7942..adc02adef83a 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
> @@ -3850,7 +3850,7 @@ static int iavf_parse_cls_flower(struct iavf_adapter *adapter,
> field_flags |= IAVF_CLOUD_FIELD_IIP;
> } else {
> dev_err(&adapter->pdev->dev, "Bad ip src mask 0x%08x\n",
> - be32_to_cpu(match.mask->dst));
> + be32_to_cpu(match.mask->src));
> return -EINVAL;
> }
> }
> --
> 2.25.1
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
It is good practice to include a changelog in the message when you send
another version. For example:
v1:
* change fix tag to 12 chars
* [RFC PATCH v5 4/4] test/vsock: vsock_perf utility
From: Arseniy Krasnov @ 2022-12-20 7:23 UTC
To: Stefano Garzarella, David S. Miller, edumazet@google.com,
Jakub Kicinski, Paolo Abeni
Cc: linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org, netdev@vger.kernel.org,
kernel, Bobby Eshleman, Krasnov Arseniy, Arseniy Krasnov
In-Reply-To: <e04f749e-f1a7-9a1d-8213-c633ffcc0a69@sberdevices.ru>
This adds a utility to check vsock rx/tx performance.
Usage as sender:
./vsock_perf --sender <cid> --port <port> --bytes <bytes to send>
Usage as receiver:
./vsock_perf --port <port> --rcvlowat <SO_RCVLOWAT>
Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
---
tools/testing/vsock/Makefile | 3 +-
tools/testing/vsock/README | 34 +++
tools/testing/vsock/vsock_perf.c | 441 +++++++++++++++++++++++++++++++
3 files changed, 477 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/vsock/vsock_perf.c
diff --git a/tools/testing/vsock/Makefile b/tools/testing/vsock/Makefile
index f8293c6910c9..43a254f0e14d 100644
--- a/tools/testing/vsock/Makefile
+++ b/tools/testing/vsock/Makefile
@@ -1,8 +1,9 @@
# SPDX-License-Identifier: GPL-2.0-only
-all: test
+all: test vsock_perf
test: vsock_test vsock_diag_test
vsock_test: vsock_test.o timeout.o control.o util.o
vsock_diag_test: vsock_diag_test.o timeout.o control.o util.o
+vsock_perf: vsock_perf.o
CFLAGS += -g -O2 -Werror -Wall -I. -I../../include -I../../../usr/include -Wno-pointer-sign -fno-strict-overflow -fno-strict-aliasing -fno-common -MMD -U_FORTIFY_SOURCE -D_GNU_SOURCE
.PHONY: all test clean
diff --git a/tools/testing/vsock/README b/tools/testing/vsock/README
index 4d5045e7d2c3..e6f6735bba05 100644
--- a/tools/testing/vsock/README
+++ b/tools/testing/vsock/README
@@ -35,3 +35,37 @@ Invoke test binaries in both directions as follows:
--control-port=$GUEST_IP \
--control-port=1234 \
--peer-cid=3
+
+vsock_perf utility
+-------------------
+'vsock_perf' is a simple tool to measure vsock performance. It works in
+sender/receiver modes: the sender connects to the peer at the specified port
+and starts data transmission to the receiver. After data processing is done,
+it prints several metrics (see below).
+
+Usage:
+# run as sender
+# connect to CID 2, port 1234, send 1G of data, tx buf size is 1M
+./vsock_perf --sender 2 --port 1234 --bytes 1G --buf-size 1M
+
+Output:
+tx performance: A Gb/s
+
+Output explanation:
+A is calculated as "number of bytes to send" / "time in tx loop"
+
+# run as receiver
+# listen on port 1234, rx buf size is 1M, socket buf size is 1G, SO_RCVLOWAT is 64K
+./vsock_perf --port 1234 --buf-size 1M --vsk-size 1G --rcvlowat 64K
+
+Output:
+rx performance: A Gb/s
+total in 'read()': B sec
+POLLIN wakeups: C
+average in 'read()': D ns
+
+Output explanation:
+A is calculated as "number of received bytes" / "time in rx loop".
+B is the time spent in the 'read()' system call (excluding 'poll()').
+C is the number of 'poll()' wakeups with the POLLIN bit set.
+D is B / C, i.e. the average amount of time spent in a single 'read()'.
diff --git a/tools/testing/vsock/vsock_perf.c b/tools/testing/vsock/vsock_perf.c
new file mode 100644
index 000000000000..ccd595462b40
--- /dev/null
+++ b/tools/testing/vsock/vsock_perf.c
@@ -0,0 +1,441 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * vsock_perf - benchmark utility for vsock.
+ *
+ * Copyright (C) 2022 SberDevices.
+ *
+ * Author: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
+ */
+#include <getopt.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <string.h>
+#include <errno.h>
+#include <unistd.h>
+#include <time.h>
+#include <stdint.h>
+#include <poll.h>
+#include <sys/socket.h>
+#include <linux/vm_sockets.h>
+
+#define DEFAULT_BUF_SIZE_BYTES (128 * 1024)
+#define DEFAULT_TO_SEND_BYTES (64 * 1024)
+#define DEFAULT_VSOCK_BUF_BYTES (256 * 1024)
+#define DEFAULT_RCVLOWAT_BYTES 1
+#define DEFAULT_PORT 1234
+
+#define BYTES_PER_GB (1024 * 1024 * 1024ULL)
+#define NSEC_PER_SEC (1000000000ULL)
+
+static unsigned int port = DEFAULT_PORT;
+static unsigned long buf_size_bytes = DEFAULT_BUF_SIZE_BYTES;
+static unsigned long vsock_buf_bytes = DEFAULT_VSOCK_BUF_BYTES;
+
+static inline time_t current_nsec(void)
+{
+ struct timespec ts;
+
+ if (clock_gettime(CLOCK_REALTIME, &ts)) {
+ perror("clock_gettime");
+ exit(EXIT_FAILURE);
+ }
+
+ return (ts.tv_sec * NSEC_PER_SEC) + ts.tv_nsec;
+}
+
+/* From lib/cmdline.c. */
+static unsigned long memparse(const char *ptr)
+{
+ char *endptr;
+
+ unsigned long long ret = strtoull(ptr, &endptr, 0);
+
+ switch (*endptr) {
+ case 'E':
+ case 'e':
+ ret <<= 10;
+ case 'P':
+ case 'p':
+ ret <<= 10;
+ case 'T':
+ case 't':
+ ret <<= 10;
+ case 'G':
+ case 'g':
+ ret <<= 10;
+ case 'M':
+ case 'm':
+ ret <<= 10;
+ case 'K':
+ case 'k':
+ ret <<= 10;
+ endptr++;
+ default:
+ break;
+ }
+
+ return ret;
+}
+
+static void vsock_increase_buf_size(int fd)
+{
+ if (setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_MAX_SIZE,
+ &vsock_buf_bytes, sizeof(vsock_buf_bytes))) {
+ perror("setsockopt(SO_VM_SOCKETS_BUFFER_MAX_SIZE)");
+ exit(EXIT_FAILURE);
+ }
+
+ if (setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_SIZE,
+ &vsock_buf_bytes, sizeof(vsock_buf_bytes))) {
+ perror("setsockopt(SO_VM_SOCKETS_BUFFER_SIZE)");
+ exit(EXIT_FAILURE);
+ }
+}
+
+static int vsock_connect(unsigned int cid, unsigned int port)
+{
+ union {
+ struct sockaddr sa;
+ struct sockaddr_vm svm;
+ } addr = {
+ .svm = {
+ .svm_family = AF_VSOCK,
+ .svm_port = port,
+ .svm_cid = cid,
+ },
+ };
+ int fd;
+
+ fd = socket(AF_VSOCK, SOCK_STREAM, 0);
+
+ if (fd < 0) {
+ perror("socket");
+ return -1;
+ }
+
+ if (connect(fd, &addr.sa, sizeof(addr.svm)) < 0) {
+ perror("connect");
+ close(fd);
+ return -1;
+ }
+
+ return fd;
+}
+
+static float get_gbps(unsigned long bits, time_t ns_delta)
+{
+ return ((float)bits / 1000000000ULL) /
+ ((float)ns_delta / NSEC_PER_SEC);
+}
+
+static void run_receiver(unsigned long rcvlowat_bytes)
+{
+ unsigned int read_cnt;
+ time_t rx_begin_ns;
+ time_t in_read_ns;
+ size_t total_recv;
+ int client_fd;
+ char *data;
+ int fd;
+ union {
+ struct sockaddr sa;
+ struct sockaddr_vm svm;
+ } addr = {
+ .svm = {
+ .svm_family = AF_VSOCK,
+ .svm_port = port,
+ .svm_cid = VMADDR_CID_ANY,
+ },
+ };
+ union {
+ struct sockaddr sa;
+ struct sockaddr_vm svm;
+ } clientaddr;
+
+ socklen_t clientaddr_len = sizeof(clientaddr.svm);
+
+ printf("Run as receiver\n");
+ printf("Listen port %u\n", port);
+ printf("RX buffer %lu bytes\n", buf_size_bytes);
+ printf("vsock buffer %lu bytes\n", vsock_buf_bytes);
+ printf("SO_RCVLOWAT %lu bytes\n", rcvlowat_bytes);
+
+ fd = socket(AF_VSOCK, SOCK_STREAM, 0);
+
+ if (fd < 0) {
+ perror("socket");
+ exit(EXIT_FAILURE);
+ }
+
+ if (bind(fd, &addr.sa, sizeof(addr.svm)) < 0) {
+ perror("bind");
+ exit(EXIT_FAILURE);
+ }
+
+ if (listen(fd, 1) < 0) {
+ perror("listen");
+ exit(EXIT_FAILURE);
+ }
+
+ client_fd = accept(fd, &clientaddr.sa, &clientaddr_len);
+
+ if (client_fd < 0) {
+ perror("accept");
+ exit(EXIT_FAILURE);
+ }
+
+ vsock_increase_buf_size(client_fd);
+
+ if (setsockopt(client_fd, SOL_SOCKET, SO_RCVLOWAT,
+ &rcvlowat_bytes,
+ sizeof(rcvlowat_bytes))) {
+ perror("setsockopt(SO_RCVLOWAT)");
+ exit(EXIT_FAILURE);
+ }
+
+ data = malloc(buf_size_bytes);
+
+ if (!data) {
+ fprintf(stderr, "'malloc()' failed\n");
+ exit(EXIT_FAILURE);
+ }
+
+ read_cnt = 0;
+ in_read_ns = 0;
+ total_recv = 0;
+ rx_begin_ns = current_nsec();
+
+ while (1) {
+ struct pollfd fds = { 0 };
+
+ fds.fd = client_fd;
+ fds.events = POLLIN | POLLERR |
+ POLLHUP | POLLRDHUP;
+
+ if (poll(&fds, 1, -1) < 0) {
+ perror("poll");
+ exit(EXIT_FAILURE);
+ }
+
+ if (fds.revents & POLLERR) {
+ fprintf(stderr, "'poll()' error\n");
+ exit(EXIT_FAILURE);
+ }
+
+ if (fds.revents & POLLIN) {
+ ssize_t bytes_read;
+ time_t t;
+
+ t = current_nsec();
+ bytes_read = read(fds.fd, data, buf_size_bytes);
+ in_read_ns += (current_nsec() - t);
+ read_cnt++;
+
+ if (!bytes_read)
+ break;
+
+ if (bytes_read < 0) {
+ perror("read");
+ exit(EXIT_FAILURE);
+ }
+
+ total_recv += bytes_read;
+ }
+
+ if (fds.revents & (POLLHUP | POLLRDHUP))
+ break;
+ }
+
+ printf("total bytes received: %zu\n", total_recv);
+ printf("rx performance: %f Gbits/s\n",
+ get_gbps(total_recv * 8, current_nsec() - rx_begin_ns));
+ printf("total time in 'read()': %f sec\n", (float)in_read_ns / NSEC_PER_SEC);
+ printf("average time in 'read()': %f ns\n", (float)in_read_ns / read_cnt);
+ printf("POLLIN wakeups: %u\n", read_cnt);
+
+ free(data);
+ close(client_fd);
+ close(fd);
+}
+
+static void run_sender(int peer_cid, unsigned long to_send_bytes)
+{
+ time_t tx_begin_ns;
+ time_t tx_total_ns;
+ size_t total_send;
+ void *data;
+ int fd;
+
+ printf("Run as sender\n");
+ printf("Connect to %i:%u\n", peer_cid, port);
+ printf("Send %lu bytes\n", to_send_bytes);
+ printf("TX buffer %lu bytes\n", buf_size_bytes);
+
+ fd = vsock_connect(peer_cid, port);
+
+ if (fd < 0)
+ exit(EXIT_FAILURE);
+
+ data = malloc(buf_size_bytes);
+
+ if (!data) {
+ fprintf(stderr, "'malloc()' failed\n");
+ exit(EXIT_FAILURE);
+ }
+
+ memset(data, 0, buf_size_bytes);
+ total_send = 0;
+ tx_begin_ns = current_nsec();
+
+ while (total_send < to_send_bytes) {
+ ssize_t sent;
+
+ sent = write(fd, data, buf_size_bytes);
+
+ if (sent <= 0) {
+ perror("write");
+ exit(EXIT_FAILURE);
+ }
+
+ total_send += sent;
+ }
+
+ tx_total_ns = current_nsec() - tx_begin_ns;
+
+ printf("total bytes sent: %zu\n", total_send);
+ printf("tx performance: %f Gbits/s\n",
+ get_gbps(total_send * 8, tx_total_ns));
+ printf("total time in 'write()': %f sec\n",
+ (float)tx_total_ns / NSEC_PER_SEC);
+
+ close(fd);
+ free(data);
+}
+
+static const char optstring[] = "";
+static const struct option longopts[] = {
+ {
+ .name = "help",
+ .has_arg = no_argument,
+ .val = 'H',
+ },
+ {
+ .name = "sender",
+ .has_arg = required_argument,
+ .val = 'S',
+ },
+ {
+ .name = "port",
+ .has_arg = required_argument,
+ .val = 'P',
+ },
+ {
+ .name = "bytes",
+ .has_arg = required_argument,
+ .val = 'M',
+ },
+ {
+ .name = "buf-size",
+ .has_arg = required_argument,
+ .val = 'B',
+ },
+ {
+ .name = "vsk-size",
+ .has_arg = required_argument,
+ .val = 'V',
+ },
+ {
+ .name = "rcvlowat",
+ .has_arg = required_argument,
+ .val = 'R',
+ },
+ {},
+};
+
+static void usage(void)
+{
+ printf("Usage: ./vsock_perf [--help] [options]\n"
+ "\n"
+ "This is a benchmarking utility to test vsock performance.\n"
+ "It runs in two modes: sender or receiver. In sender mode, it\n"
+ "connects to the specified CID and starts data transmission.\n"
+ "\n"
+ "Options:\n"
+ " --help This message\n"
+ " --sender <cid> Sender mode (receiver default)\n"
+ " <cid> of the receiver to connect to\n"
+ " --port <port> Port (default %d)\n"
+ " --bytes <bytes>KMG Bytes to send (default %d)\n"
+ " --buf-size <bytes>KMG Data buffer size (default %d). In sender mode\n"
+ " it is the buffer size, passed to 'write()'. In\n"
+ " receiver mode it is the buffer size passed to 'read()'.\n"
+ " --vsk-size <bytes>KMG Socket buffer size (default %d)\n"
+ " --rcvlowat <bytes>KMG SO_RCVLOWAT value (default %d)\n"
+ "\n", DEFAULT_PORT, DEFAULT_TO_SEND_BYTES,
+ DEFAULT_BUF_SIZE_BYTES, DEFAULT_VSOCK_BUF_BYTES,
+ DEFAULT_RCVLOWAT_BYTES);
+ exit(EXIT_FAILURE);
+}
+
+static long strtolx(const char *arg)
+{
+ long value;
+ char *end;
+
+ value = strtol(arg, &end, 10);
+
+ if (end != arg + strlen(arg))
+ usage();
+
+ return value;
+}
+
+int main(int argc, char **argv)
+{
+ unsigned long to_send_bytes = DEFAULT_TO_SEND_BYTES;
+ unsigned long rcvlowat_bytes = DEFAULT_RCVLOWAT_BYTES;
+ int peer_cid = -1;
+ bool sender = false;
+
+ while (1) {
+ int opt = getopt_long(argc, argv, optstring, longopts, NULL);
+
+ if (opt == -1)
+ break;
+
+ switch (opt) {
+ case 'V': /* Peer buffer size. */
+ vsock_buf_bytes = memparse(optarg);
+ break;
+ case 'R': /* SO_RCVLOWAT value. */
+ rcvlowat_bytes = memparse(optarg);
+ break;
+ case 'P': /* Port to connect to. */
+ port = strtolx(optarg);
+ break;
+ case 'M': /* Bytes to send. */
+ to_send_bytes = memparse(optarg);
+ break;
+ case 'B': /* Size of rx/tx buffer. */
+ buf_size_bytes = memparse(optarg);
+ break;
+ case 'S': /* Sender mode. CID to connect to. */
+ peer_cid = strtolx(optarg);
+ sender = true;
+ break;
+ case 'H': /* Help. */
+ usage();
+ break;
+ default:
+ usage();
+ }
+ }
+
+ if (!sender)
+ run_receiver(rcvlowat_bytes);
+ else
+ run_sender(peer_cid, to_send_bytes);
+
+ return 0;
+}
--
2.25.1
^ permalink raw reply related
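Editorial note on 'strtolx()' in the patch above: as written it accepts an empty argument (returning 0) and does not look at 'errno', so an out-of-range value is silently clamped by 'strtol()'. The sketch below shows one stricter way to parse the same arguments. 'parse_long_strict()' is a hypothetical name and is not part of the patch; it reports failure to the caller rather than calling 'usage()'.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of the patch): parse a base-10 long and
 * return false on any malformed or out-of-range input.
 */
static bool parse_long_strict(const char *arg, long *out)
{
	char *end;
	long value;

	if (!arg || *arg == '\0')
		return false;	/* empty string: strtol() would return 0 */

	errno = 0;
	value = strtol(arg, &end, 10);

	if (errno == ERANGE)
		return false;	/* value does not fit in a long */

	if (end == arg || *end != '\0')
		return false;	/* no digits at all, or trailing garbage */

	*out = value;
	return true;
}
```

A caller such as the option loop could then do 'if (!parse_long_strict(optarg, &value)) usage();' and keep the existing behaviour for valid input.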
* [RFC PATCH v5 3/4] test/vsock: add big message test
From: Arseniy Krasnov @ 2022-12-20 7:22 UTC (permalink / raw)
To: Stefano Garzarella, Jakub Kicinski, Paolo Abeni,
edumazet@google.com, David S. Miller
Cc: linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org, netdev@vger.kernel.org,
kernel, Arseniy Krasnov, Krasnov Arseniy, Bobby Eshleman
In-Reply-To: <e04f749e-f1a7-9a1d-8213-c633ffcc0a69@sberdevices.ru>
This adds a test that sends a message bigger than the peer's buffer size.
For a SOCK_SEQPACKET socket the send must fail, as this type of socket has
a message size limit.
Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
---
tools/testing/vsock/vsock_test.c | 69 ++++++++++++++++++++++++++++++++
1 file changed, 69 insertions(+)
diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
index 26c38ad9d07b..67e9f9df3a8c 100644
--- a/tools/testing/vsock/vsock_test.c
+++ b/tools/testing/vsock/vsock_test.c
@@ -569,6 +569,70 @@ static void test_seqpacket_timeout_server(const struct test_opts *opts)
close(fd);
}
+static void test_seqpacket_bigmsg_client(const struct test_opts *opts)
+{
+ unsigned long sock_buf_size;
+ ssize_t send_size;
+ socklen_t len;
+ void *data;
+ int fd;
+
+ len = sizeof(sock_buf_size);
+
+ fd = vsock_seqpacket_connect(opts->peer_cid, 1234);
+ if (fd < 0) {
+ perror("connect");
+ exit(EXIT_FAILURE);
+ }
+
+ if (getsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_SIZE,
+ &sock_buf_size, &len)) {
+ perror("getsockopt");
+ exit(EXIT_FAILURE);
+ }
+
+ sock_buf_size++;
+
+ data = malloc(sock_buf_size);
+ if (!data) {
+ perror("malloc");
+ exit(EXIT_FAILURE);
+ }
+
+ send_size = send(fd, data, sock_buf_size, 0);
+ if (send_size != -1) {
+ fprintf(stderr, "expected 'send(2)' failure, got %zi\n",
+ send_size);
+ exit(EXIT_FAILURE);
+ }
+
+ if (errno != EMSGSIZE) {
+ fprintf(stderr, "expected EMSGSIZE in 'errno', got %i\n",
+ errno);
+ exit(EXIT_FAILURE);
+ }
+
+ control_writeln("CLISENT");
+
+ free(data);
+ close(fd);
+}
+
+static void test_seqpacket_bigmsg_server(const struct test_opts *opts)
+{
+ int fd;
+
+ fd = vsock_seqpacket_accept(VMADDR_CID_ANY, 1234, NULL);
+ if (fd < 0) {
+ perror("accept");
+ exit(EXIT_FAILURE);
+ }
+
+ control_expectln("CLISENT");
+
+ close(fd);
+}
+
#define BUF_PATTERN_1 'a'
#define BUF_PATTERN_2 'b'
@@ -851,6 +915,11 @@ static struct test_case test_cases[] = {
.run_client = test_stream_poll_rcvlowat_client,
.run_server = test_stream_poll_rcvlowat_server,
},
+ {
+ .name = "SOCK_SEQPACKET big message",
+ .run_client = test_seqpacket_bigmsg_client,
+ .run_server = test_seqpacket_bigmsg_server,
+ },
{},
};
--
2.25.1
^ permalink raw reply related
* [RFC PATCH v5 2/4] test/vsock: rework message bounds test
From: Arseniy Krasnov @ 2022-12-20 7:20 UTC (permalink / raw)
To: Stefano Garzarella, David S. Miller, Jakub Kicinski, Paolo Abeni,
edumazet@google.com
Cc: linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org, netdev@vger.kernel.org,
kernel, Krasnov Arseniy, Arseniy Krasnov, Bobby Eshleman
In-Reply-To: <e04f749e-f1a7-9a1d-8213-c633ffcc0a69@sberdevices.ru>
This reworks the message bounds test, making it more complex. Instead of
sending 1-byte messages with a single MSG_EOR bit, it sends messages of
random length (one half of the messages are smaller than the page size, the
other half are bigger) with a random number of MSG_EOR bits set. The receiver
also doesn't know the total number of messages.
Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
---
tools/testing/vsock/control.c | 28 +++++++
tools/testing/vsock/control.h | 2 +
tools/testing/vsock/util.c | 13 ++++
tools/testing/vsock/util.h | 1 +
tools/testing/vsock/vsock_test.c | 128 +++++++++++++++++++++++++++----
5 files changed, 157 insertions(+), 15 deletions(-)
diff --git a/tools/testing/vsock/control.c b/tools/testing/vsock/control.c
index 4874872fc5a3..d2deb4b15b94 100644
--- a/tools/testing/vsock/control.c
+++ b/tools/testing/vsock/control.c
@@ -141,6 +141,34 @@ void control_writeln(const char *str)
timeout_end();
}
+void control_writeulong(unsigned long value)
+{
+ char str[32];
+
+ if (snprintf(str, sizeof(str), "%lu", value) >= sizeof(str)) {
+ fprintf(stderr, "control_writeulong: value truncated\n");
+ exit(EXIT_FAILURE);
+ }
+
+ control_writeln(str);
+}
+
+unsigned long control_readulong(void)
+{
+ unsigned long value;
+ char *str;
+
+ str = control_readln();
+
+ if (!str)
+ exit(EXIT_FAILURE);
+
+ value = strtoul(str, NULL, 10);
+ free(str);
+
+ return value;
+}
+
/* Return the next line from the control socket (without the trailing newline).
*
* The program terminates if a timeout occurs.
diff --git a/tools/testing/vsock/control.h b/tools/testing/vsock/control.h
index 51814b4f9ac1..c1f77fdb2c7a 100644
--- a/tools/testing/vsock/control.h
+++ b/tools/testing/vsock/control.h
@@ -9,7 +9,9 @@ void control_init(const char *control_host, const char *control_port,
void control_cleanup(void);
void control_writeln(const char *str);
char *control_readln(void);
+unsigned long control_readulong(void);
void control_expectln(const char *str);
bool control_cmpln(char *line, const char *str, bool fail);
+void control_writeulong(unsigned long value);
#endif /* CONTROL_H */
diff --git a/tools/testing/vsock/util.c b/tools/testing/vsock/util.c
index 2acbb7703c6a..01b636d3039a 100644
--- a/tools/testing/vsock/util.c
+++ b/tools/testing/vsock/util.c
@@ -395,3 +395,16 @@ void skip_test(struct test_case *test_cases, size_t test_cases_len,
test_cases[test_id].skip = true;
}
+
+unsigned long hash_djb2(const void *data, size_t len)
+{
+ unsigned long hash = 5381;
+ size_t i = 0;
+
+ while (i < len) {
+ hash = ((hash << 5) + hash) + ((unsigned char *)data)[i];
+ i++;
+ }
+
+ return hash;
+}
diff --git a/tools/testing/vsock/util.h b/tools/testing/vsock/util.h
index a3375ad2fb7f..fb99208a95ea 100644
--- a/tools/testing/vsock/util.h
+++ b/tools/testing/vsock/util.h
@@ -49,4 +49,5 @@ void run_tests(const struct test_case *test_cases,
void list_tests(const struct test_case *test_cases);
void skip_test(struct test_case *test_cases, size_t test_cases_len,
const char *test_id_str);
+unsigned long hash_djb2(const void *data, size_t len);
#endif /* UTIL_H */
diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
index bb6d691cb30d..26c38ad9d07b 100644
--- a/tools/testing/vsock/vsock_test.c
+++ b/tools/testing/vsock/vsock_test.c
@@ -284,10 +284,14 @@ static void test_stream_msg_peek_server(const struct test_opts *opts)
close(fd);
}
-#define MESSAGES_CNT 7
-#define MSG_EOR_IDX (MESSAGES_CNT / 2)
+#define SOCK_BUF_SIZE (2 * 1024 * 1024)
+#define MAX_MSG_SIZE (32 * 1024)
+
static void test_seqpacket_msg_bounds_client(const struct test_opts *opts)
{
+ unsigned long curr_hash;
+ int page_size;
+ int msg_count;
int fd;
fd = vsock_seqpacket_connect(opts->peer_cid, 1234);
@@ -296,18 +300,79 @@ static void test_seqpacket_msg_bounds_client(const struct test_opts *opts)
exit(EXIT_FAILURE);
}
- /* Send several messages, one with MSG_EOR flag */
- for (int i = 0; i < MESSAGES_CNT; i++)
- send_byte(fd, 1, (i == MSG_EOR_IDX) ? MSG_EOR : 0);
+ /* Wait, until receiver sets buffer size. */
+ control_expectln("SRVREADY");
+
+ curr_hash = 0;
+ page_size = getpagesize();
+ msg_count = SOCK_BUF_SIZE / MAX_MSG_SIZE;
+
+ for (int i = 0; i < msg_count; i++) {
+ ssize_t send_size;
+ size_t buf_size;
+ int flags;
+ void *buf;
+
+ /* Use "small" buffers and "big" buffers. */
+ if (i & 1)
+ buf_size = page_size +
+ (rand() % (MAX_MSG_SIZE - page_size));
+ else
+ buf_size = 1 + (rand() % page_size);
+
+ buf = malloc(buf_size);
+
+ if (!buf) {
+ perror("malloc");
+ exit(EXIT_FAILURE);
+ }
+
+ memset(buf, rand() & 0xff, buf_size);
+ /* Set at least one MSG_EOR + some random. */
+ if (i == (msg_count / 2) || (rand() & 1)) {
+ flags = MSG_EOR;
+ curr_hash++;
+ } else {
+ flags = 0;
+ }
+
+ send_size = send(fd, buf, buf_size, flags);
+
+ if (send_size < 0) {
+ perror("send");
+ exit(EXIT_FAILURE);
+ }
+
+ if (send_size != buf_size) {
+ fprintf(stderr, "Invalid send size\n");
+ exit(EXIT_FAILURE);
+ }
+
+ /*
+ * Hash sum is computed at both client and server in
+ * the same way:
+ * H += hash('message data')
+ * Such a hash "controls" both data integrity and message
+ * bounds. After the data exchange, both sums are compared
+ * using the control socket, and if message bounds weren't
+ * broken, the two values must be equal.
+ */
+ curr_hash += hash_djb2(buf, buf_size);
+ free(buf);
+ }
control_writeln("SENDDONE");
+ control_writeulong(curr_hash);
close(fd);
}
static void test_seqpacket_msg_bounds_server(const struct test_opts *opts)
{
+ unsigned long sock_buf_size;
+ unsigned long remote_hash;
+ unsigned long curr_hash;
int fd;
- char buf[16];
+ char buf[MAX_MSG_SIZE];
struct msghdr msg = {0};
struct iovec iov = {0};
@@ -317,25 +382,57 @@ static void test_seqpacket_msg_bounds_server(const struct test_opts *opts)
exit(EXIT_FAILURE);
}
+ sock_buf_size = SOCK_BUF_SIZE;
+
+ if (setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_MAX_SIZE,
+ &sock_buf_size, sizeof(sock_buf_size))) {
+ perror("setsockopt(SO_VM_SOCKETS_BUFFER_MAX_SIZE)");
+ exit(EXIT_FAILURE);
+ }
+
+ if (setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_SIZE,
+ &sock_buf_size, sizeof(sock_buf_size))) {
+ perror("setsockopt(SO_VM_SOCKETS_BUFFER_SIZE)");
+ exit(EXIT_FAILURE);
+ }
+
+ /* Ready to receive data. */
+ control_writeln("SRVREADY");
+ /* Wait, until peer sends whole data. */
control_expectln("SENDDONE");
iov.iov_base = buf;
iov.iov_len = sizeof(buf);
msg.msg_iov = &iov;
msg.msg_iovlen = 1;
- for (int i = 0; i < MESSAGES_CNT; i++) {
- if (recvmsg(fd, &msg, 0) != 1) {
- perror("message bound violated");
- exit(EXIT_FAILURE);
- }
+ curr_hash = 0;
- if ((i == MSG_EOR_IDX) ^ !!(msg.msg_flags & MSG_EOR)) {
- perror("MSG_EOR");
+ while (1) {
+ ssize_t recv_size;
+
+ recv_size = recvmsg(fd, &msg, 0);
+
+ if (!recv_size)
+ break;
+
+ if (recv_size < 0) {
+ perror("recvmsg");
exit(EXIT_FAILURE);
}
+
+ if (msg.msg_flags & MSG_EOR)
+ curr_hash++;
+
+ curr_hash += hash_djb2(msg.msg_iov[0].iov_base, recv_size);
}
close(fd);
+ remote_hash = control_readulong();
+
+ if (curr_hash != remote_hash) {
+ fprintf(stderr, "Message bounds broken\n");
+ exit(EXIT_FAILURE);
+ }
}
#define MESSAGE_TRUNC_SZ 32
@@ -427,7 +524,7 @@ static void test_seqpacket_timeout_client(const struct test_opts *opts)
tv.tv_usec = 0;
if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, (void *)&tv, sizeof(tv)) == -1) {
- perror("setsockopt 'SO_RCVTIMEO'");
+ perror("setsockopt(SO_RCVTIMEO)");
exit(EXIT_FAILURE);
}
@@ -644,7 +741,7 @@ static void test_stream_poll_rcvlowat_client(const struct test_opts *opts)
if (setsockopt(fd, SOL_SOCKET, SO_RCVLOWAT,
&lowat_val, sizeof(lowat_val))) {
- perror("setsockopt");
+ perror("setsockopt(SO_RCVLOWAT)");
exit(EXIT_FAILURE);
}
@@ -837,6 +934,7 @@ int main(int argc, char **argv)
.peer_cid = VMADDR_CID_ANY,
};
+ srand(time(NULL));
init_signals();
for (;;) {
--
2.25.1
^ permalink raw reply related
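Editorial note on the hash check in the patch above: the sum H += hash(message data), plus one per MSG_EOR, changes when the same bytes are split into messages differently. The sketch below is a userspace illustration under stated assumptions: 'struct msg' and 'bounds_sum()' are hypothetical names, and only the djb2 recurrence is taken from 'hash_djb2()' in the patch.

```c
#include <stddef.h>
#include <string.h>

/* djb2 over a byte buffer, same recurrence as hash_djb2() in the patch. */
static unsigned long djb2(const void *data, size_t len)
{
	const unsigned char *p = data;
	unsigned long hash = 5381;

	for (size_t i = 0; i < len; i++)
		hash = ((hash << 5) + hash) + p[i];

	return hash;
}

/* Illustrative stand-in for one sent or received message. */
struct msg {
	const char *data;
	int eor;	/* non-zero if MSG_EOR was set on this message */
};

/* Sum of (hash(data) + eor) over all messages, as both peers compute it. */
static unsigned long bounds_sum(const struct msg *msgs, size_t count)
{
	unsigned long sum = 0;

	for (size_t i = 0; i < count; i++) {
		if (msgs[i].eor)
			sum++;
		sum += djb2(msgs[i].data, strlen(msgs[i].data));
	}

	return sum;
}
```

With the messages {"ab", "c"} on one side and {"a", "bc"} or {"abc"} on the other, the sums differ even though the byte stream is identical. Because the sum is commutative and hashes can collide, this is a probabilistic check and it would not notice reordered messages.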
* [RFC PATCH v5 1/4] vsock: return errors other than -ENOMEM to socket
From: Arseniy Krasnov @ 2022-12-20 7:18 UTC (permalink / raw)
To: Stefano Garzarella, David S. Miller, edumazet@google.com,
Paolo Abeni, Jakub Kicinski
Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
virtualization@lists.linux-foundation.org, kernel, Bobby Eshleman,
Krasnov Arseniy, Arseniy Krasnov
In-Reply-To: <e04f749e-f1a7-9a1d-8213-c633ffcc0a69@sberdevices.ru>
This removes the behaviour where an error code returned from any transport
was always replaced with ENOMEM. For example, when a user tries to send a too
big message via a SEQPACKET socket, the transport layer returns EMSGSIZE, but
this error code was always replaced with ENOMEM and returned to the user.
Signed-off-by: Bobby Eshleman <bobby.eshleman@bytedance.com>
Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
---
net/vmw_vsock/af_vsock.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index d593d5b6d4b1..19aea7cba26e 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1861,8 +1861,9 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
written = transport->stream_enqueue(vsk,
msg, len - total_written);
}
+
if (written < 0) {
- err = -ENOMEM;
+ err = written;
goto out_err;
}
--
2.25.1
^ permalink raw reply related
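Editorial note on the kernel patch above: the change amounts to passing the transport's negative return value through instead of overwriting it. The sketch below models that in userspace. 'toy_enqueue()', 'send_old()' and 'send_new()' are hypothetical names used only for illustration and do not exist in af_vsock.c.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical transport callback: rejects messages above a size limit. */
static long toy_enqueue(size_t len, size_t max_len)
{
	if (len > max_len)
		return -EMSGSIZE;

	return (long)len;
}

/* Behaviour before the patch: any transport failure becomes -ENOMEM. */
static long send_old(size_t len, size_t max_len)
{
	long written = toy_enqueue(len, max_len);

	if (written < 0)
		return -ENOMEM;

	return written;
}

/* Behaviour after the patch: the transport's error code is returned as is. */
static long send_new(size_t len, size_t max_len)
{
	long written = toy_enqueue(len, max_len);

	if (written < 0)
		return written;

	return written;
}
```

For a 10-byte message with a 4-byte limit, 'send_old()' reports -ENOMEM while 'send_new()' reports -EMSGSIZE, which is the error the big message test in this series checks for.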
* [RFC PATCH v5 0/4] vsock: update tools and error handling
From: Arseniy Krasnov @ 2022-12-20 7:16 UTC (permalink / raw)
To: Stefano Garzarella, David S. Miller, edumazet@google.com,
Paolo Abeni, Jakub Kicinski
Cc: linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org, netdev@vger.kernel.org,
kernel, Bobby Eshleman, Krasnov Arseniy, Arseniy Krasnov
Patchset consists of two parts:
1) Kernel patch
One patch from Bobby Eshleman. I took a single patch from Bobby:
https://lore.kernel.org/lkml/d81818b868216c774613dd03641fcfe63cc55a45.1660362668.git.bobby.eshleman@bytedance.com/
and use only the part for af_vsock.c, as the VMCI and Hyper-V parts were
rejected.
I used it because big message handling for SOCK_SEQPACKET was broken:
ENOMEM was returned instead of EMSGSIZE. In any case, the current logic,
which always replaces any error code returned by the transport with ENOMEM,
looks strange to me (for example, EMSGSIZE was changed to ENOMEM).
2) Tool patches
Since there is work on several significant updates for vsock (virtio/vsock
especially), namely skbuff, DGRAM and zerocopy rx/tx, I think that this
patchset will be useful.
This patchset updates the vsock tests and tools a little bit. First of all
it updates the test suite with two tests. The first is a reworked message
bounds test, which is now more complex. Instead of sending 1-byte
messages with one MSG_EOR bit, it sends messages of random length (one
half of the messages are smaller than the page size, the other half are
bigger) with a random number of MSG_EOR bits set. The receiver also doesn't
know the total number of messages. Message bounds are checked by comparing
hash sums computed over the messages on both sides. The second test is for
SOCK_SEQPACKET: it tries to send a message longer than allowed. I think
both tests will be useful for DGRAM support also.
The third thing this patchset adds is a small utility to test vsock
performance for both rx and tx. I think this util could be as useful as
'iperf'/'uperf', because:
1) It is small compared to 'iperf' or 'uperf', so it is very easy to add a
new mode or feature to it (especially a vsock-specific one).
2) It allows setting the SO_RCVLOWAT and SO_VM_SOCKETS_BUFFER_SIZE options.
Overall throughput depends on both parameters.
3) It is located in the kernel source tree, so it can be updated by
the same patchset which changes related kernel functionality in vsock.
I used this util very often to check the performance of my rx zerocopy
support (this tool has rx zerocopy support, but not in this patchset).
Here is a comparison of outputs from three utils: 'iperf', 'uperf' and
'vsock_perf'. In all three cases the sender was on the guest side. rx and
tx buffers were always 64 KB (because by default 'uperf' uses 8 KB).
iperf:
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.00 sec 12.8 GBytes 11.0 Gbits/sec sender
[ 5] 0.00-10.00 sec 12.8 GBytes 11.0 Gbits/sec receiver
uperf:
Total 16.27GB / 11.36(s) = 12.30Gb/s 23455op/s
vsock_perf:
tx performance: 12.301529 Gbits/s
rx performance: 12.288011 Gbits/s
Results are almost the same in all three cases.
Patchset was rebased and tested on skbuff v8 patch from Bobby Eshleman:
https://lore.kernel.org/netdev/20221215043645.3545127-1-bobby.eshleman@bytedance.com/
Changelog:
v4 -> v5:
- Kernel patch: update commit message
- vsock_perf:
- Fix typo in commit message
- Use "fprintf(stderr," instead of "printf(" for errors
- More stats for tx: total bytes sent and time in tx loop
- Print throughput in 'gigabits' instead of 'gigabytes'(as in
'iperf' and 'uperf')
- Output comparisons between 'iperf', 'uperf' and 'vsock_perf'
added to CV.
v3 -> v4:
- Kernel patch: update commit message by adding error case description
- Message bounds test:
- Typo fix: s/contols/controls
- Fix error output on 'setsockopt()'s
- vsock_perf:
- Add 'vsock_perf' target to 'all' in Makefile
- Fix error output on 'setsockopt()'s
- Swap sender/receiver roles: now sender does 'connect()' and sends
data, while receiver accepts connection.
- Update arguments names: s/mb/bytes, s/so_rcvlowat/rcvlowat
- Update usage output and description in README
v2 -> v3:
- Patches for VMCI and Hyper-V were removed from patchset(commented by
Vishnu Dasa and Dexuan Cui)
- In message bounds test hash is computed from data buffer with random
content(in v2 it was size only). This approach controls both data
integrity and message bounds.
- vsock_perf:
- grammar fixes
- only long parameters supported(instead of only short)
v1 -> v2:
- Three new patches from Bobby Eshleman to kernel part
- Message bounds test: some refactoring and add comment to describe
hashing purpose
- Big message test: check 'errno' for EMSGSIZE and move new test to
the end of tests array
- vsock_perf:
- update README file
- add simple usage example to commit message
- update '-h' (help) output
- use 'stdout' for output instead of 'stderr'
- use 'strtol' instead of 'atoi'
Bobby Eshleman(1):
vsock: return errors other than -ENOMEM to socket
Arseniy Krasnov(3):
test/vsock: rework message bound test
test/vsock: add big message test
test/vsock: vsock_perf utility
net/vmw_vsock/af_vsock.c | 3 +-
tools/testing/vsock/Makefile | 3 +-
tools/testing/vsock/README | 34 +++
tools/testing/vsock/control.c | 28 +++
tools/testing/vsock/control.h | 2 +
tools/testing/vsock/util.c | 13 ++
tools/testing/vsock/util.h | 1 +
tools/testing/vsock/vsock_perf.c | 441 +++++++++++++++++++++++++++++++++++++++
tools/testing/vsock/vsock_test.c | 197 +++++++++++++++--
9 files changed, 705 insertions(+), 17 deletions(-)
--
2.25.1
^ permalink raw reply
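Editorial note on the throughput figures in the cover letter above: bytes times 8 divided by elapsed nanoseconds gives bits per nanosecond, which is numerically the same as Gbits/s. The sketch below shows that arithmetic. 'gbits_per_sec()' is a hypothetical helper and is not claimed to match 'get_gbps()' in vsock_perf.c.

```c
#include <stdint.h>

/*
 * Hypothetical helper: throughput in Gbits/s.
 * bits / nanoseconds == (bits / 1e9) / (nanoseconds / 1e9) == Gbits / seconds.
 */
static double gbits_per_sec(uint64_t bytes, uint64_t elapsed_ns)
{
	if (elapsed_ns == 0)
		return 0.0;	/* avoid dividing by zero for an empty run */

	return (double)bytes * 8.0 / (double)elapsed_ns;
}
```

As a cross-check, the 'uperf' line (16.27GB in 11.36 s, reported as 12.30Gb/s) comes out at about 12.30 with this formula if "GB" is read as 2^30 bytes. That reading is an assumption; the cover letter does not state the unit.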
* Re: [RFC PATCH v1 0/2] virtio/vsock: fix mutual rx/tx hungup
From: Arseniy Krasnov @ 2022-12-20 7:14 UTC (permalink / raw)
To: Stefano Garzarella
Cc: Stefan Hajnoczi, edumazet@google.com, David S. Miller,
Jakub Kicinski, Paolo Abeni, linux-kernel@vger.kernel.org,
netdev@vger.kernel.org, virtualization@lists.linux-foundation.org,
kvm@vger.kernel.org, kernel, Krasnov Arseniy
In-Reply-To: <CAGxU2F4ca5pxW3RX4wzsTx3KRBtxLK_rO9KxPgUtqcaSNsqXCA@mail.gmail.com>
On 19.12.2022 18:41, Stefano Garzarella wrote:
Hello!
> Hi Arseniy,
>
> On Sat, Dec 17, 2022 at 8:42 PM Arseniy Krasnov <AVKrasnov@sberdevices.ru> wrote:
>>
>> Hello,
>>
>> seems I found strange thing(may be a bug) where sender('tx' later) and
>> receiver('rx' later) could stuck forever. Potential fix is in the first
>> patch, second patch contains reproducer, based on vsock test suite.
>> Reproducer is simple: tx just sends data to rx by 'write()' syscall, rx
>> dequeues it using 'read()' syscall and uses 'poll()' for waiting. I run
>> server in host and client in guest.
>>
>> rx side params:
>> 1) SO_VM_SOCKETS_BUFFER_SIZE is 256Kb(e.g. default).
>> 2) SO_RCVLOWAT is 128Kb.
>>
>> What happens in the reproducer step by step:
>>
>
> I put the values of the variables involved to facilitate understanding:
>
> RX: buf_alloc = 256 KB; fwd_cnt = 0; last_fwd_cnt = 0;
> free_space = buf_alloc - (fwd_cnt - last_fwd_cnt) = 256 KB
>
> The credit update is sent if
> free_space < VIRTIO_VSOCK_MAX_PKT_BUF_SIZE [64 KB]
>
>> 1) tx tries to send 256Kb + 1 byte (in a single 'write()')
>> 2) tx sends 256Kb, data reaches rx (rx_bytes == 256Kb)
>> 3) tx waits for space in 'write()' to send last 1 byte
>> 4) rx does poll(), (rx_bytes >= rcvlowat) 256Kb >= 128Kb, POLLIN is set
>> 5) rx reads 64Kb, credit update is not sent due to *
>
> RX: buf_alloc = 256 KB; fwd_cnt = 64 KB; last_fwd_cnt = 0;
> free_space = 192 KB
>
>> 6) rx does poll(), (rx_bytes >= rcvlowat) 192Kb >= 128Kb, POLLIN is set
>> 7) rx reads 64Kb, credit update is not sent due to *
>
> RX: buf_alloc = 256 KB; fwd_cnt = 128 KB; last_fwd_cnt = 0;
> free_space = 128 KB
>
>> 8) rx does poll(), (rx_bytes >= rcvlowat) 128Kb >= 128Kb, POLLIN is set
>> 9) rx reads 64Kb, credit update is not sent due to *
>
> Right, (free_space < VIRTIO_VSOCK_MAX_PKT_BUF_SIZE) is still false.
>
> RX: buf_alloc = 256 KB; fwd_cnt = 192 KB; last_fwd_cnt = 0;
> free_space = 64 KB
>
>> 10) rx does poll(), (rx_bytes < rcvlowat) 64Kb < 128Kb, rx waits in poll()
>
> I agree that the TX is stuck because we are not sending the credit
> update, but also if RX sends the credit update at step 9, RX won't be
> woken up at step 10, right?
Yes, RX will sleep, but TX will wake up, and since we inform TX how much
free space we have, there are now two cases for TX:
1) It sends the "small" rest of the data (e.g. without blocking again), leaves
'write()' and continues execution. RX still waits in 'poll()'. Later TX will
send enough data to wake up RX.
2) It sends the "big" rest of the data: if the rest is too big to leave
'write()' and TX has to wait again for free space, it will still be able to
send enough data to wake up RX, as we compared 'rx_bytes' with the rcvlowat
value in RX.
>
>>
>> * is optimization in 'virtio_transport_stream_do_dequeue()' which
>> sends OP_CREDIT_UPDATE only when we have not too much space -
>> less than VIRTIO_VSOCK_MAX_PKT_BUF_SIZE.
>>
>> Now tx side waits for space inside write() and rx waits in poll() for
>> 'rx_bytes' to reach SO_RCVLOWAT value. Both sides will wait forever. I
>> think, possible fix is to send credit update not only when we have too
>> small space, but also when number of bytes in receive queue is smaller
>> than SO_RCVLOWAT thus not enough to wake up sleeping reader. I'm not
>> sure about correctness of this idea, but anyway - I think that problem
>> above exists. What do You think?
>
> I'm not sure, I have to think more about it, but if RX reads less than
> SO_RCVLOWAT, I expect it's normal to get to a case of stuck.
>
> In this case we are only unstucking TX, but even if it sends that single
> byte, RX is still stuck and not consuming it, so it was useless to wake
> up TX if RX won't consume it anyway, right?
1) I think it is not useless, because we inform (not just wake up) TX that
there is free space at the RX side, as I mentioned above.
2) Anyway, I think that this situation is a little bit strange: TX thinks that
there is no free space at RX and waits for it, but there is free space at RX!
At the same time, RX waits in poll() forever: it is ready to get a new portion
of data to return POLLIN, but TX "thinks" exactly the opposite, that RX is full
of data. Of course, if there were just stalls in TX data handling, it would
be OK, just a performance degradation, but here TX is stuck forever.
>
> If RX woke up (e.g. SO_RCVLOWAT = 64KB) and read the remaining 64KB,
> then it would still send the credit update even without this patch and
> TX will send the 1 byte.
But how will RX wake up in this case? E.g. it calls poll() without a timeout,
the connection is established, and RX ignores signals.
Thanks, Arseniy
>
> Thanks,
> Stefano
>
^ permalink raw reply
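Editorial note on the step-by-step scenario in the message above: it can be replayed as a small userspace model. The sketch below is a simplification under stated assumptions: 'simulate()' is a hypothetical function, the constants are the values quoted in the thread (256 KB buffer, 128 KB SO_RCVLOWAT, 64 KB threshold and read size), and the free space formula is the one written in the quoted reply, not code taken from the virtio transport.

```c
#include <stdbool.h>

#define KB			1024UL
#define BUF_ALLOC		(256 * KB)	/* SO_VM_SOCKETS_BUFFER_SIZE on rx */
#define RCVLOWAT		(128 * KB)	/* SO_RCVLOWAT on rx */
#define MAX_PKT_BUF_SIZE	(64 * KB)	/* credit update threshold */
#define READ_SIZE		(64 * KB)	/* bytes taken by each read() */

/*
 * Returns how many bytes tx still could not send once rx goes to sleep
 * in poll(). A non-zero result means both sides wait forever.
 */
static unsigned long simulate(bool update_below_lowat)
{
	unsigned long rx_bytes = BUF_ALLOC;	/* steps 1-2: 256 KB queued */
	unsigned long tx_pending = 1;		/* step 3: one byte left */
	unsigned long tx_cnt = BUF_ALLOC;	/* bytes tx has sent so far */
	unsigned long fwd_cnt = 0;		/* bytes consumed by rx */
	unsigned long last_fwd_cnt = 0;		/* fwd_cnt last reported to tx */

	/* poll() reports POLLIN only while rx_bytes >= rcvlowat. */
	while (rx_bytes >= RCVLOWAT) {
		unsigned long free_space;

		rx_bytes -= READ_SIZE;
		fwd_cnt += READ_SIZE;
		free_space = BUF_ALLOC - (fwd_cnt - last_fwd_cnt);

		/* Existing rule, plus the proposed extra condition. */
		if (free_space < MAX_PKT_BUF_SIZE ||
		    (update_below_lowat && rx_bytes < RCVLOWAT))
			last_fwd_cnt = fwd_cnt;	/* credit update reaches tx */

		/* tx sends while its view of the peer buffer has room. */
		while (tx_pending && tx_cnt - last_fwd_cnt < BUF_ALLOC) {
			tx_cnt++;
			tx_pending--;
			rx_bytes++;
		}
	}

	return tx_pending;
}
```

In this model 'simulate(false)' leaves the last byte unsent after three reads, matching steps 5 to 10, while 'simulate(true)' lets tx finish its 'write()' even though rx stays below SO_RCVLOWAT.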
* Re: [PATCH v2 3/4] sched/isolation: Add HK_TYPE_WQ to isolcpus=domain
From: Leonardo Brás @ 2022-12-20 6:57 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Peter Zijlstra, Steffen Klassert, Herbert Xu, David S. Miller,
Bjorn Helgaas, Ingo Molnar, Juri Lelli, Vincent Guittot,
Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
Daniel Bristot de Oliveira, Valentin Schneider, Tejun Heo,
Lai Jiangshan, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Phil Auld, Antoine Tenart, Christophe JAILLET, Wang Yufen,
mtosatti, linux-crypto, linux-kernel, linux-pci, netdev, fweisbec
In-Reply-To: <20221129121051.GB1715045@lothringen>
On Tue, 2022-11-29 at 13:10 +0100, Frederic Weisbecker wrote:
> On Fri, Oct 14, 2022 at 01:27:25PM -0300, Leonardo Brás wrote:
> > Hello Frederic,
> >
> > So, IIUC you are removing all flags composing nohz_full= parameter in favor of a
> > unified NOHZ_FULL flag.
> >
> > I am very new to the code, and I am probably missing the whole picture, but I
> > actually think it's a good approach to keep them split for a couple reasons:
> > 1 - They are easier to understand in code (IMHO):
> > "This cpu should not do this, because it's not able to do WQ housekeeping" looks
> > better than "because it's not in DOMAIN or NOHZ_FULL housekeeping"
>
> A comment above each site may solve that.
Sure, but not having to leave comments would be better. Or am I missing
something?
>
> >
> > 2 - They are simpler for using:
> > Suppose we have this function that should run at a WQ, but we want to keep them
> > out of the isolated cpus. If we have the unified flags, we need to combine both
> > DOMAIN and NOHZ_FULL bitmasks, and then combine it again with something like
> > cpu_online_mask. It usually means allocating a new cpumask_t, and also freeing
> > it afterwards.
> > If we have a single WQ flag, we can avoid the allocation altogether by using
> > for_each_cpu_and(), making the code much simpler.
>
> I guess having a specific function for workqueues would arrange for it.
You mean keeping a WQ housekeeping bitmap? This could be a solution, but it
would affect only the WQ example.
>
> >
> > 3 - It makes easier to compose new isolation modes:
> > In case the future requires a new isolation mode that also uses the types of
> > isolation we currently have implemented, it would be much easier to just compose
> > it with the current HK flags, instead of having to go through all usages and do
> > a cpumask_and() there. Also, new isolation modes would make (2) worse.
>
> Actually having a new feature merged in HK_NOHZ_FULL would make it easier to
> handle as it avoids spreading cpumasks. I'm not sure I understand what you
> mean.
IIUC, your queued patch merges the housekeeping types HK_TYPE_TIMER,
HK_TYPE_RCU, HK_TYPE_MISC, HK_TYPE_TICK, HK_TYPE_WQ and HK_TYPE_KTHREAD in a
single HK_TYPE_NOHZ_FULL.
Suppose in future we need a new isolation feature in cmdline, say
isol_new=<cpulist>, and it works exactly like nohz_full=<cpulist>, but also
needs to isolate cpulist against something else, say doing X.
How would this get implemented? IIUC, following the same pattern:
- A new type HK_TYPE_ISOL_NEW would be created together with a cpumask,
- The new cpumask would be used to keep cpulist from doing X
- All places that use HK_TYPE_NOHZ_FULL bitmap for isolation would need to also
bitmask_and() the new cpumask. (sometimes needing a local cpumask_t)
Ok, there may be shortcuts for this, like keeping an intermediary bitmap, but
that can become tricky.
Another, more complex example: a new isolation feature isol_new2=<cpulist> behaves
like nohz_full=<cpulist>, keeps cpulist from doing X, but allows unbound RCU
work. Now it's even harder to have shortcuts from previous implementation.
What I am trying to defend here is that keeping the HK_type with the idea of
"things to get cpulist isolated from" works better for future implementations
than a single flag with a lot of responsibilities:
- A new type HK_TYPE_X would be created together with a cpumask,
- The new cpumask would be used to keep cpulist from doing X
- isol_new=<cpulist> is composed with the flags for what cpulist is getting
isolated.
- (No need to touch already implemented isolations.)
In fact, I propose that it works better for current implementations also:
The current patch (3/4) takes the WQ isolation responsibility from
HK_TYPE_DOMAIN and focuses it in HK_TYPE_WQ, adding it to isolcpus=<cpulist>
flags. This avoids some cpumask_and()s, and a cpumask_t kzalloc, and makes the
code less complex to implement when we need to put isolation in further parts of
the code. (patch 4/4)
I am not sure if I am missing some important point here.
Please let me know if it's the case.
>
> Thanks.
>
Thank you for replying!
Leo
^ permalink raw reply
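Editorial note on point 2 in the message above: with one housekeeping mask per purpose, a caller can walk the intersection of two masks directly, in the spirit of the kernel's for_each_cpu_and(), without allocating a combined mask first. The sketch below is a userspace toy with 64-bit masks. 'toy_cpumask_t', 'toy_next_cpu_and()' and 'toy_count_cpus_and()' are hypothetical names and not kernel APIs.

```c
#include <stdint.h>

#define TOY_NR_CPUS 64

/* Toy stand-in for a kernel cpumask: bit N set means CPU N is in the mask. */
typedef uint64_t toy_cpumask_t;

/*
 * Return the first CPU after 'prev' that is set in both masks, or
 * TOY_NR_CPUS if there is none. Pass prev = -1 to start from CPU 0.
 */
static int toy_next_cpu_and(int prev, toy_cpumask_t a, toy_cpumask_t b)
{
	for (int cpu = prev + 1; cpu < TOY_NR_CPUS; cpu++) {
		if ((a >> cpu) & (b >> cpu) & 1)
			return cpu;
	}

	return TOY_NR_CPUS;
}

/* Count CPUs present in both masks without building a temporary mask. */
static int toy_count_cpus_and(toy_cpumask_t a, toy_cpumask_t b)
{
	int count = 0;

	for (int cpu = toy_next_cpu_and(-1, a, b); cpu < TOY_NR_CPUS;
	     cpu = toy_next_cpu_and(cpu, a, b))
		count++;	/* a real caller would act on 'cpu' here */

	return count;
}
```

With a workqueue housekeeping mask of 0x0F (CPUs 0-3) and an online mask of 0x36 (CPUs 1, 2, 4, 5), the walk visits CPUs 1 and 2 only. Intersecting three masks this way would need either a temporary mask or an extra per-CPU test, which is the cost the message argues against.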
* Re: [PATCH v2 4/9] dt-bindings: net: Add bindings for StarFive dwmac
From: yanhong wang @ 2022-12-20 6:57 UTC (permalink / raw)
To: Krzysztof Kozlowski, linux-riscv, netdev, devicetree,
linux-kernel
Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Rob Herring, Krzysztof Kozlowski, Emil Renner Berthing,
Richard Cochran, Andrew Lunn, Heiner Kallweit, Peter Geis
In-Reply-To: <a8e09f78-704f-13f0-15ad-6c6dca6997f3@linaro.org>
On 2022/12/16 19:06, Krzysztof Kozlowski wrote:
> On 16/12/2022 08:06, Yanhong Wang wrote:
>> Add documentation to describe StarFive dwmac driver(GMAC).
>>
>
> Subject: drop second, redundant "bindings for".
>
Thanks, I will fix it.
> Best regards,
> Krzysztof
>
^ permalink raw reply
* Re: [PATCH v2 4/9] dt-bindings: net: Add bindings for StarFive dwmac
From: yanhong wang @ 2022-12-20 6:53 UTC (permalink / raw)
To: Krzysztof Kozlowski, linux-riscv, netdev, devicetree,
linux-kernel
Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Rob Herring, Krzysztof Kozlowski, Emil Renner Berthing,
Richard Cochran, Andrew Lunn, Heiner Kallweit, Peter Geis
In-Reply-To: <2da394a6-411a-ca2b-90d3-7e97f3637d9f@linaro.org>
On 2022/12/16 19:05, Krzysztof Kozlowski wrote:
> On 16/12/2022 08:06, Yanhong Wang wrote:
>> Add documentation to describe StarFive dwmac driver(GMAC).
>>
>> Signed-off-by: Yanhong Wang <yanhong.wang@starfivetech.com>
>> ---
>> .../devicetree/bindings/net/snps,dwmac.yaml | 1 +
>> .../bindings/net/starfive,jh71x0-dwmac.yaml | 103 ++++++++++++++++++
>> MAINTAINERS | 5 +
>> 3 files changed, 109 insertions(+)
>> create mode 100644 Documentation/devicetree/bindings/net/starfive,jh71x0-dwmac.yaml
>>
>> diff --git a/Documentation/devicetree/bindings/net/snps,dwmac.yaml b/Documentation/devicetree/bindings/net/snps,dwmac.yaml
>> index 7870228b4cd3..cdb045d1c618 100644
>> --- a/Documentation/devicetree/bindings/net/snps,dwmac.yaml
>> +++ b/Documentation/devicetree/bindings/net/snps,dwmac.yaml
>> @@ -91,6 +91,7 @@ properties:
>> - snps,dwmac-5.20
>> - snps,dwxgmac
>> - snps,dwxgmac-2.10
>> + - starfive,jh7110-dwmac
>>
>> reg:
>> minItems: 1
>> diff --git a/Documentation/devicetree/bindings/net/starfive,jh71x0-dwmac.yaml b/Documentation/devicetree/bindings/net/starfive,jh71x0-dwmac.yaml
>> new file mode 100644
>> index 000000000000..5cb1272fe959
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/net/starfive,jh71x0-dwmac.yaml
>> @@ -0,0 +1,103 @@
>> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
>> +# Copyright (C) 2022 StarFive Technology Co., Ltd.
>> +%YAML 1.2
>> +---
>> +$id: http://devicetree.org/schemas/net/starfive,jh71x0-dwmac.yaml#
>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>> +
>> +title: StarFive JH71x0 DWMAC glue layer
>> +
>> +maintainers:
>> + - Yanhong Wang <yanhong.wang@starfivetech.com>
>> +
>> +select:
>> + properties:
>> + compatible:
>> + contains:
>> + enum:
>> + - starfive,jh7110-dwmac
>> + required:
>> + - compatible
>> +
>> +allOf:
>> + - $ref: snps,dwmac.yaml#
>> +
>> +properties:
>> + compatible:
>> + items:
>> + - enum:
>> + - starfive,jh7110-dwmac
>
> Is it going to grow with new models? If yes, when? If not, filename does
> not match compatible.
I will update the filename in the next version.
>
>> + - const: snps,dwmac-5.20
>> +
>> + clocks:
>> + items:
>> + - description: GMAC main clock
>> + - description: GMAC AHB clock
>> + - description: PTP clock
>> + - description: TX clock
>> + - description: GTXC clock
>> + - description: GTX clock
>> +
>> + clock-names:
>> + items:
>> + - const: stmmaceth
>> + - const: pclk
>> + - const: ptp_ref
>> + - const: tx
>> + - const: gtxc
>> + - const: gtx
>
> missing resets and reset-names.
>
I will add resets and reset-names in the next version.
>> +
>> +required:
>> + - compatible
>> + - clocks
>> + - clock-names
>> + - resets
>> + - reset-names
>> +
>> +unevaluatedProperties: false
>> +
> Best regards,
> Krzysztof
>
^ permalink raw reply
* Re: [PATCH v2 2/9] dt-bindings: net: snps,dwmac: Update the maxitems number of resets and reset-names
From: yanhong wang @ 2022-12-20 6:48 UTC (permalink / raw)
To: Krzysztof Kozlowski, linux-riscv, netdev, devicetree,
linux-kernel
Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Rob Herring, Krzysztof Kozlowski, Emil Renner Berthing,
Richard Cochran, Andrew Lunn, Heiner Kallweit, Peter Geis
In-Reply-To: <040b56b1-c65c-34c3-e4a1-5cae4428d1d2@linaro.org>
On 2022/12/16 19:03, Krzysztof Kozlowski wrote:
> On 16/12/2022 08:06, Yanhong Wang wrote:
>> Some boards(such as StarFive VisionFive v2) require more than one value
>> which defined by resets property, so the original definition can not
>> meet the requirements. In order to adapt to different requirements,
>> adjust the maxitems number from 1 to 3..
>>
>> Signed-off-by: Yanhong Wang <yanhong.wang@starfivetech.com>
>> ---
>> .../devicetree/bindings/net/snps,dwmac.yaml | 15 +++++++++++----
>> 1 file changed, 11 insertions(+), 4 deletions(-)
>>
>> diff --git a/Documentation/devicetree/bindings/net/snps,dwmac.yaml b/Documentation/devicetree/bindings/net/snps,dwmac.yaml
>> index e26c3e76ebb7..7870228b4cd3 100644
>> --- a/Documentation/devicetree/bindings/net/snps,dwmac.yaml
>> +++ b/Documentation/devicetree/bindings/net/snps,dwmac.yaml
>> @@ -133,12 +133,19 @@ properties:
>> - ptp_ref
>>
>> resets:
>> - maxItems: 1
>> - description:
>> - MAC Reset signal.
>> + minItems: 1
>> + maxItems: 3
>> + additionalItems: true
>> + items:
>> + - description: MAC Reset signal
>>
>> reset-names:
>> - const: stmmaceth
>> + minItems: 1
>> + maxItems: 3
>> + additionalItems: true
>> + contains:
>> + enum:
>> + - stmmaceth
>
> No, this is highly unspecific and you know affect all the schemas using
> snps,dwmac.yaml. Both lists must be specific - for your device and for
> others.
>
I tried to define the resets in "starfive,jh71x0-dwmac.yaml", but that
cannot override the maxItems limit in "snps,dwmac.yaml". As a result,
running "make dt_binding_check" reports the error
"reset-names: ['stmmaceth', 'ahb'] is too long".
Do you have any suggestions for dealing with this situation?
> Best regards,
> Krzysztof
>
^ permalink raw reply
* Re: [PATCH 0/4] Fix probe failed when modprobe modules
From: Jason Wang @ 2022-12-20 6:44 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Li Zetao, pbonzini, stefanha, axboe, kraxel, david, ericvh, lucho,
asmadeus, linux_oss, davem, edumazet, kuba, pabeni, rusty,
virtualization, linux-block, linux-kernel, v9fs-developer, netdev
In-Reply-To: <20221219050716-mutt-send-email-mst@kernel.org>
On Mon, Dec 19, 2022 at 6:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Nov 29, 2022 at 11:37:09AM +0800, Jason Wang wrote:
> > >
> > >
> > > Quite a lot of core work here. Jason are you still looking into
> > > hardening?
> >
> > Yes, last time we've discussed a solution that depends on the first
> > kick to enable the interrupt handler. But after some thought, it seems
> > risky since there's no guarantee that the device work in this way.
> >
> > One example is the current vhost_net, it doesn't wait for the kick to
> > process the rx packets. Any more thought on this?
> >
> > Thanks
>
> Specifically virtio net is careful to call virtio_device_ready
> under rtnl lock so buffers are only added after DRIVER_OK.
Right, but it only got fixed this year after some code audit.
>
> However we do not need to tie this to kick, this is what I wrote:
>
> > BTW Jason, I had the idea to disable callbacks until driver uses the
> > virtio core for the first time (e.g. by calling virtqueue_add* family of
> > APIs). Less aggressive than your ideas but I feel it will add security
> > to the init path at least.
>
> So not necessarily kick, we can make adding buffers allow the
> interrupt.
Some questions:
1) It introduces a code-defined behaviour instead of depending on a
spec-defined behaviour like DRIVER_OK; this will lead to extra complexity
in auditing.
2) There's no guarantee that the interrupt handler is ready before
virtqueue_add(), or it requires barriers before virtqueue_add() to
make sure the handler is committed.
So it looks to me like virtio_device_ready() should still be the
correct way to go:
1) it depends on spec-defined behaviour like DRIVER_OK, and it can then
comply with possible future security requirements for drivers
defined in the spec
2) choose to use a new boolean instead of reusing vq->broken
3) enable the hardening in drivers one by one
Does it make sense?
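To make point 2) of the second list concrete, here is a minimal userspace
sketch, assuming a dedicated "ready" flag that is set once at the
DRIVER_OK step and checked in the interrupt path. All names (demo_vq,
demo_device_ready, demo_vq_interrupt) are hypothetical and are not the
virtio core API; C11 atomics stand in for the kernel's barriers.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a virtqueue with a flag separate from "broken". */
struct demo_vq {
	atomic_bool ready;	/* set once the driver declares DRIVER_OK */
	bool broken;		/* keeps its original meaning: device failure */
	int handled;		/* number of callbacks actually processed */
};

void demo_vq_init(struct demo_vq *vq)
{
	atomic_init(&vq->ready, false);
	vq->broken = false;
	vq->handled = 0;
}

/* Stand-in for the DRIVER_OK step: publish all earlier setup, then open the gate. */
void demo_device_ready(struct demo_vq *vq)
{
	atomic_store_explicit(&vq->ready, true, memory_order_release);
}

/* Interrupt path: drop callbacks that arrive before the driver is ready. */
bool demo_vq_interrupt(struct demo_vq *vq)
{
	if (!atomic_load_explicit(&vq->ready, memory_order_acquire))
		return false;
	if (vq->broken)
		return false;
	vq->handled++;
	return true;
}
```

The release/acquire pair is what orders handler setup before the first
callback that is allowed through; a real implementation would have to
pick the matching kernel primitives.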
Thanks
>
>
>
> --
> MST
>
^ permalink raw reply
* [PATCH v2] iavf/iavf_main: actually log ->src mask when talking about it
From: Daniil Tatianin @ 2022-12-20 6:32 UTC (permalink / raw)
To: Jesse Brandeburg
Cc: Daniil Tatianin, Tony Nguyen, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Harshitha Ramamurthy, Jeff Kirsher, intel-wired-lan,
netdev, linux-kernel
This fixes a copy-paste issue where dev_err would log the dst mask even
though it is clearly talking about src.
Found by Linux Verification Center (linuxtesting.org) with the SVACE
static analysis tool.
Fixes: 0075fa0fadd0 ("i40evf: Add support to apply cloud filters")
Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
---
drivers/net/ethernet/intel/iavf/iavf_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index c4e451ef7942..adc02adef83a 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -3850,7 +3850,7 @@ static int iavf_parse_cls_flower(struct iavf_adapter *adapter,
field_flags |= IAVF_CLOUD_FIELD_IIP;
} else {
dev_err(&adapter->pdev->dev, "Bad ip src mask 0x%08x\n",
- be32_to_cpu(match.mask->dst));
+ be32_to_cpu(match.mask->src));
return -EINVAL;
}
}
--
2.25.1
^ permalink raw reply related
* Re: [syzbot] WARNING in put_pmu_ctx
From: syzbot @ 2022-12-20 6:29 UTC (permalink / raw)
To: acme, alexander.shishkin, bpf, jolsa, linux-kernel,
linux-perf-users, mark.rutland, mingo, namhyung, netdev, peterz,
syzkaller-bugs
In-Reply-To: <0000000000009cd81e05f0317886@google.com>
syzbot has found a reproducer for the following issue on:
HEAD commit: e2bb9e01d589 bpf: Remove trace_printk_lock
git tree: bpf-next
console output: https://syzkaller.appspot.com/x/log.txt?x=124cf480480000
kernel config: https://syzkaller.appspot.com/x/.config?x=b0e91ad4b5f69c47
dashboard link: https://syzkaller.appspot.com/bug?extid=697196bc0265049822bd
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=163fde6f880000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17319890480000
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/cee993a7fed1/disk-e2bb9e01.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/109057856bce/vmlinux-e2bb9e01.xz
kernel image: https://storage.googleapis.com/syzbot-assets/7da529d16ff7/bzImage-e2bb9e01.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+697196bc0265049822bd@syzkaller.appspotmail.com
------------[ cut here ]------------
WARNING: CPU: 0 PID: 5367 at kernel/events/core.c:4920 put_pmu_ctx kernel/events/core.c:4920 [inline]
WARNING: CPU: 0 PID: 5367 at kernel/events/core.c:4920 put_pmu_ctx+0x2a5/0x390 kernel/events/core.c:4893
Modules linked in:
CPU: 0 PID: 5367 Comm: syz-executor374 Not tainted 6.1.0-syzkaller-09637-ge2bb9e01d589 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
RIP: 0010:put_pmu_ctx kernel/events/core.c:4920 [inline]
RIP: 0010:put_pmu_ctx+0x2a5/0x390 kernel/events/core.c:4893
Code: dd ff e8 2e 0d dd ff 48 8d 7b 50 48 c7 c6 a0 fa a2 81 e8 3e c6 c7 ff eb d6 e8 17 0d dd ff 0f 0b e9 64 ff ff ff e8 0b 0d dd ff <0f> 0b eb 88 e8 c2 bc 2a 00 eb a5 e8 fb 0c dd ff 0f 0b e9 e4 fd ff
RSP: 0018:ffffc90003e2fc68 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff8880b9842328 RCX: 0000000000000000
RDX: ffff88802ad5ba80 RSI: ffffffff81a3a605 RDI: 0000000000000001
RBP: ffff8880b9842358 R08: 0000000000000001 R09: 0000000000000001
R10: ffffed1017306cf8 R11: 0000000000000000 R12: ffff8880b9836890
R13: ffff8880b98367c0 R14: 0000000000000293 R15: ffff8880b9842330
FS: 0000555557481300(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f31c999e758 CR3: 0000000020f5e000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
_free_event+0x3c5/0x13d0 kernel/events/core.c:5196
put_event kernel/events/core.c:5283 [inline]
perf_event_release_kernel+0x6ad/0x8f0 kernel/events/core.c:5395
perf_release+0x37/0x50 kernel/events/core.c:5405
__fput+0x27c/0xa90 fs/file_table.c:320
task_work_run+0x16f/0x270 kernel/task_work.c:179
resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
exit_to_user_mode_prepare+0x23c/0x250 kernel/entry/common.c:203
__syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f31c9963019
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 a1 17 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fff1d0f25c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 00000000000f4240 RCX: 00007f31c9963019
RDX: 00007f31c9963019 RSI: 0000000000000000 RDI: 0000000000000005
RBP: 0000000000000000 R08: 0000000000000140 R09: 0000000000000140
R10: 0000000000000008 R11: 0000000000000246 R12: 00007fff1d0f25e0
R13: 00007fff1d0f2600 R14: 0000000000025861 R15: 00007fff1d0f25dc
</TASK>
^ permalink raw reply
* Re: [PULL] Networking for next-6.1
From: Kuniyuki Iwashima @ 2022-12-20 6:28 UTC (permalink / raw)
To: jirislaby
Cc: davem, edumazet, joannelkoong, kuba, kuniyu, linux-kernel, netdev,
pabeni
In-Reply-To: <5bb57ae6-c2a7-e6ea-3fe8-62b8b61bc7c5@kernel.org>
From: Jiri Slaby <jirislaby@kernel.org>
Date: Tue, 20 Dec 2022 07:22:56 +0100
> On 19. 12. 22, 0:25, Kuniyuki Iwashima wrote:
> > From: Jiri Slaby <jirislaby@kernel.org>
> > Date: Fri, 16 Dec 2022 11:49:01 +0100
> >> Hi,
> >>
> >> On 04. 10. 22, 7:20, Jakub Kicinski wrote:
> >>> Joanne Koong (7):
> >>
> >>> net: Add a bhash2 table hashed by port and address
> >>
> >> This makes regression tests of python-ephemeral-port-reserve to fail.
> >>
> >> I'm not sure if the issue is in the commit or in the test.
> >
> > Hi Jiri,
> >
> > Thanks for reporting the issue.
> >
> > It seems we forgot to add TIME_WAIT sockets into bhash2 in
> > inet_twsk_hashdance(), therefore inet_bhash2_conflict() misses
> > TIME_WAIT sockets when validating bind() requests if the address
> > is not a wildcard one.
> >
> > I'll fix it.
>
> Hi,
>
> is there a fix for this available somewhere yet?
Not yet, but I'll CC you when posting a patch.
Thank you.
^ permalink raw reply
* Re: [PULL] Networking for next-6.1
From: Jiri Slaby @ 2022-12-20 6:22 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: davem, joannelkoong, kuba, linux-kernel, netdev, pabeni, edumazet
In-Reply-To: <20221218232547.44526-1-kuniyu@amazon.com>
On 19. 12. 22, 0:25, Kuniyuki Iwashima wrote:
> From: Jiri Slaby <jirislaby@kernel.org>
> Date: Fri, 16 Dec 2022 11:49:01 +0100
>> Hi,
>>
>> On 04. 10. 22, 7:20, Jakub Kicinski wrote:
>>> Joanne Koong (7):
>>
>>> net: Add a bhash2 table hashed by port and address
>>
>> This makes regression tests of python-ephemeral-port-reserve to fail.
>>
>> I'm not sure if the issue is in the commit or in the test.
>
> Hi Jiri,
>
> Thanks for reporting the issue.
>
> It seems we forgot to add TIME_WAIT sockets into bhash2 in
> inet_twsk_hashdance(), therefore inet_bhash2_conflict() misses
> TIME_WAIT sockets when validating bind() requests if the address
> is not a wildcard one.
>
> I'll fix it.
Hi,
is there a fix for this available somewhere yet?
thanks,
--
js
suse labs
^ permalink raw reply
* Re: [PATCH] wifi: rtl8xxxu: fixing transmisison failure for rtl8192eu
From: Jun ASAKA @ 2022-12-20 6:02 UTC (permalink / raw)
To: Ping-Ke Shih, Jes.Sorensen@gmail.com
Cc: kvalo@kernel.org, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com,
linux-wireless@vger.kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
In-Reply-To: <3b4124ebabcb4ceaae89cd9ccf84c7de@realtek.com>
On 20/12/2022 13:44, Ping-Ke Shih wrote:
>
>> -----Original Message-----
>> From: Jun ASAKA <JunASAKA@zzy040330.moe>
>> Sent: Saturday, December 17, 2022 11:07 AM
>> To: Jes.Sorensen@gmail.com
>> Cc: kvalo@kernel.org; davem@davemloft.net; edumazet@google.com; kuba@kernel.org; pabeni@redhat.com;
>> linux-wireless@vger.kernel.org; netdev@vger.kernel.org; linux-kernel@vger.kernel.org; Jun ASAKA
>> <JunASAKA@zzy040330.moe>
>> Subject: [PATCH] wifi: rtl8xxxu: fixing transmisison failure for rtl8192eu
>>
>> Fixing transmission failure which results in
>> "authentication with ... timed out". This can be
>> fixed by disable the REG_TXPAUSE.
>>
>> Signed-off-by: Jun ASAKA <JunASAKA@zzy040330.moe>
>> ---
>> drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8192e.c | 5 +++++
>> 1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8192e.c
>> b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8192e.c
>> index a7d76693c02d..9d0ed6760cb6 100644
>> --- a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8192e.c
>> +++ b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8192e.c
>> @@ -1744,6 +1744,11 @@ static void rtl8192e_enable_rf(struct rtl8xxxu_priv *priv)
>> val8 = rtl8xxxu_read8(priv, REG_PAD_CTRL1);
>> val8 &= ~BIT(0);
>> rtl8xxxu_write8(priv, REG_PAD_CTRL1, val8);
>> +
>> + /*
>> + * Fix transmission failure of rtl8192e.
>> + */
>> + rtl8xxxu_write8(priv, REG_TXPAUSE, 0x00);
> I trace when rtl8xxxu set REG_TXPAUSE=0xff that will stop TX.
> The occasions include RF calibration, LPS mode (called by power off), and
> going to stop. So, I think RF calibration does TX pause but not restore
> settings after calibration, and causes TX stuck. As the flow I traced,
> this patch looks reasonable. But, I wonder why other people don't meet
> this problem.
>
> Reviewed-by: Ping-Ke Shih <pkshih@realtek.com>
>
>> }
>>
>> static s8 rtl8192e_cck_rssi(struct rtl8xxxu_priv *priv, u8 cck_agc_rpt)
>> --
>> 2.31.1
In my case, one of my rtl8192eu devices, a Tenda U1, did not work
with this module originally; it printed "authentication with ...
timed out" in dmesg. This change fixes the problem.
Thanks for your review.
Jun ASAKA.
^ permalink raw reply
* Re: [PATCH 2/3] can: esd_usb: Improved behavior on esd CAN_ERROR_EXT event (2)
From: Vincent MAILHOL @ 2022-12-20 5:49 UTC (permalink / raw)
To: Frank Jungclaus
Cc: linux-can, Marc Kleine-Budde, Wolfgang Grandegger,
Stefan Mätje, netdev, linux-kernel
In-Reply-To: <20221219212717.1298282-1-frank.jungclaus@esd.eu>
On Tue. 20 Dec. 2022 at 06:29, Frank Jungclaus <frank.jungclaus@esd.eu> wrote:
> Started a rework initiated by Vincents remarks "You should not report
> the greatest of txerr and rxerr but the one which actually increased."
> [1]
I do not see this comment being addressed. You are still assigning the
flags depending on the highest value, not the one which actually
changed.
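A rough sketch of what "the one which actually changed" could look like,
assuming the driver keeps the counters from the previous event somewhere.
The struct demo_berr and the helper name are hypothetical; the two flag
values are local copies of the ones in include/uapi/linux/can/error.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins; the values mirror include/uapi/linux/can/error.h. */
#define CAN_ERR_CRTL_RX_WARNING 0x04
#define CAN_ERR_CRTL_TX_WARNING 0x08

/* Hypothetical record of the counters seen in the previous event. */
struct demo_berr {
	uint8_t txerr;
	uint8_t rxerr;
};

/*
 * Return the warning flag(s) for the counter(s) that increased since the
 * previous event, then remember the new values for the next call.
 */
uint8_t demo_warning_flags(struct demo_berr *prev, uint8_t txerr, uint8_t rxerr)
{
	uint8_t flags = 0;

	assert(prev != NULL);
	if (txerr > prev->txerr)
		flags |= CAN_ERR_CRTL_TX_WARNING;
	if (rxerr > prev->rxerr)
		flags |= CAN_ERR_CRTL_RX_WARNING;

	prev->txerr = txerr;
	prev->rxerr = rxerr;
	return flags;
}
```

With previous counters (120, 0) and new counters (120, 100), comparing
txerr against rxerr would pick TX, while this helper reports RX because
only the RX counter moved.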
> and "As far as I understand, those flags should be set only when
> the threshold is *reached*" [2] .
>
> Now setting the flags for CAN_ERR_CRTL_[RT]X_WARNING and
> CAN_ERR_CRTL_[RT]X_PASSIVE regarding REC and TEC, when the
> appropriate threshold is reached.
>
> Fixes: 96d8e90382dc ("can: Add driver for esd CAN-USB/2 device")
> Signed-off-by: Frank Jungclaus <frank.jungclaus@esd.eu>
> Link: [1] https://lore.kernel.org/all/CAMZ6RqKGBWe15aMkf8-QLf-cOQg99GQBebSm+1wEzTqHgvmNuw@mail.gmail.com/
> Link: [2] https://lore.kernel.org/all/CAMZ6Rq+QBO1yTX_o6GV0yhdBj-RzZSRGWDZBS0fs7zbSTy4hmA@mail.gmail.com/
> ---
> drivers/net/can/usb/esd_usb.c | 14 ++++++++------
> 1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/can/usb/esd_usb.c b/drivers/net/can/usb/esd_usb.c
> index 5e182fadd875..09745751f168 100644
> --- a/drivers/net/can/usb/esd_usb.c
> +++ b/drivers/net/can/usb/esd_usb.c
> @@ -255,10 +255,18 @@ static void esd_usb_rx_event(struct esd_usb_net_priv *priv,
> can_bus_off(priv->netdev);
> break;
> case ESD_BUSSTATE_WARN:
> + cf->can_id |= CAN_ERR_CRTL;
> + cf->data[1] = (txerr > rxerr) ?
> + CAN_ERR_CRTL_TX_WARNING :
> + CAN_ERR_CRTL_RX_WARNING;
Nitpick: when a ternary operator is too long to fit on one line,
prefer an if/else.
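For illustration, the same selection written as an if/else. This is only
a standalone sketch: the helper name is made up, and the two flag values
are local copies of the ones in include/uapi/linux/can/error.h.

```c
#include <stdint.h>

/* Local stand-ins; the values mirror include/uapi/linux/can/error.h. */
#define CAN_ERR_CRTL_RX_WARNING 0x04
#define CAN_ERR_CRTL_TX_WARNING 0x08

/* Hypothetical helper: same result as the quoted ternary, as if/else. */
uint8_t demo_warn_flag(uint8_t txerr, uint8_t rxerr)
{
	if (txerr > rxerr)
		return CAN_ERR_CRTL_TX_WARNING;
	else
		return CAN_ERR_CRTL_RX_WARNING;
}
```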
> priv->can.state = CAN_STATE_ERROR_WARNING;
> priv->can.can_stats.error_warning++;
> break;
> case ESD_BUSSTATE_ERRPASSIVE:
> + cf->can_id |= CAN_ERR_CRTL;
> + cf->data[1] = (txerr > rxerr) ?
> + CAN_ERR_CRTL_TX_PASSIVE :
> + CAN_ERR_CRTL_RX_PASSIVE;
Same.
> priv->can.state = CAN_STATE_ERROR_PASSIVE;
> priv->can.can_stats.error_passive++;
> break;
> @@ -296,12 +304,6 @@ static void esd_usb_rx_event(struct esd_usb_net_priv *priv,
> /* Bit stream position in CAN frame as the error was detected */
> cf->data[3] = ecc & SJA1000_ECC_SEG;
>
> - if (priv->can.state == CAN_STATE_ERROR_WARNING ||
> - priv->can.state == CAN_STATE_ERROR_PASSIVE) {
> - cf->data[1] = (txerr > rxerr) ?
> - CAN_ERR_CRTL_TX_PASSIVE :
> - CAN_ERR_CRTL_RX_PASSIVE;
> - }
> cf->data[6] = txerr;
> cf->data[7] = rxerr;
> }
Yours sincerely,
Vincent Mailhol
^ permalink raw reply
* RE: [PATCH] wifi: rtl8xxxu: fixing transmisison failure for rtl8192eu
From: Ping-Ke Shih @ 2022-12-20 5:44 UTC (permalink / raw)
To: Jun ASAKA, Jes.Sorensen@gmail.com
Cc: kvalo@kernel.org, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com,
linux-wireless@vger.kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
In-Reply-To: <20221217030659.12577-1-JunASAKA@zzy040330.moe>
> -----Original Message-----
> From: Jun ASAKA <JunASAKA@zzy040330.moe>
> Sent: Saturday, December 17, 2022 11:07 AM
> To: Jes.Sorensen@gmail.com
> Cc: kvalo@kernel.org; davem@davemloft.net; edumazet@google.com; kuba@kernel.org; pabeni@redhat.com;
> linux-wireless@vger.kernel.org; netdev@vger.kernel.org; linux-kernel@vger.kernel.org; Jun ASAKA
> <JunASAKA@zzy040330.moe>
> Subject: [PATCH] wifi: rtl8xxxu: fixing transmisison failure for rtl8192eu
>
> Fixing transmission failure which results in
> "authentication with ... timed out". This can be
> fixed by disable the REG_TXPAUSE.
>
> Signed-off-by: Jun ASAKA <JunASAKA@zzy040330.moe>
> ---
> drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8192e.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8192e.c
> b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8192e.c
> index a7d76693c02d..9d0ed6760cb6 100644
> --- a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8192e.c
> +++ b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8192e.c
> @@ -1744,6 +1744,11 @@ static void rtl8192e_enable_rf(struct rtl8xxxu_priv *priv)
> val8 = rtl8xxxu_read8(priv, REG_PAD_CTRL1);
> val8 &= ~BIT(0);
> rtl8xxxu_write8(priv, REG_PAD_CTRL1, val8);
> +
> + /*
> + * Fix transmission failure of rtl8192e.
> + */
> + rtl8xxxu_write8(priv, REG_TXPAUSE, 0x00);
I traced where rtl8xxxu sets REG_TXPAUSE=0xff, which stops TX.
The occasions include RF calibration, LPS mode (called by power off), and
going to stop. So, I think RF calibration pauses TX but does not restore
the setting after calibration, which causes TX to get stuck. Based on the
flow I traced, this patch looks reasonable. But I wonder why other people
don't hit this problem.
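The suspected pattern can be modelled in a few lines. This is a toy
userspace sketch, not the rtl8xxxu code: the register is just a struct
field, and both calibration functions are hypothetical stand-ins for
"pause TX, do work, optionally restore".

```c
#include <stdint.h>

/* Toy register file; txpause stands in for REG_TXPAUSE. */
struct demo_regs {
	uint8_t txpause;
};

/* Suspected behaviour: TX is paused for calibration and left paused. */
void demo_calibrate_leaky(struct demo_regs *r)
{
	r->txpause = 0xff;
	/* ... calibration work would happen here ... */
}

/* Save/restore variant: the previous pause state survives calibration. */
void demo_calibrate_restoring(struct demo_regs *r)
{
	uint8_t saved = r->txpause;

	r->txpause = 0xff;
	/* ... calibration work would happen here ... */
	r->txpause = saved;
}
```

If the leaky variant matches what the driver does, writing 0x00 in
enable_rf hides the symptom, while a save/restore in the calibration
path would address the cause; which of the two applies here is not
established by this thread.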
Reviewed-by: Ping-Ke Shih <pkshih@realtek.com>
> }
>
> static s8 rtl8192e_cck_rssi(struct rtl8xxxu_priv *priv, u8 cck_agc_rpt)
> --
> 2.31.1
^ permalink raw reply
* Re: [PATCH 3/3] can: esd_usb: Improved decoding for ESD_EV_CAN_ERROR_EXT messages
From: Vincent MAILHOL @ 2022-12-20 5:27 UTC (permalink / raw)
To: Frank Jungclaus
Cc: linux-can, Marc Kleine-Budde, Wolfgang Grandegger,
Stefan Mätje, netdev, linux-kernel
In-Reply-To: <20221219212717.1298282-2-frank.jungclaus@esd.eu>
Le mar. 20 déc. 2022 à 06:28, Frank Jungclaus <frank.jungclaus@esd.eu> a écrit :
>
> As suggested by Marc there now is a union plus a struct ev_can_err_ext
> for easier decoding of an ESD_EV_CAN_ERROR_EXT event message (which
> simply is a rx_msg with some dedicated data).
>
> Suggested-by: Marc Kleine-Budde <mkl@pengutronix.de>
> Link: https://lore.kernel.org/linux-can/20220621071152.ggyhrr5sbzvwpkpx@pengutronix.de/
> Signed-off-by: Frank Jungclaus <frank.jungclaus@esd.eu>
> ---
> drivers/net/can/usb/esd_usb.c | 18 +++++++++++++-----
> 1 file changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/net/can/usb/esd_usb.c b/drivers/net/can/usb/esd_usb.c
> index 09745751f168..f90bb2c0ba15 100644
> --- a/drivers/net/can/usb/esd_usb.c
> +++ b/drivers/net/can/usb/esd_usb.c
> @@ -127,7 +127,15 @@ struct rx_msg {
> u8 dlc;
> __le32 ts;
> __le32 id; /* upper 3 bits contain flags */
> - u8 data[8];
> + union {
> + u8 data[8];
> + struct {
> + u8 status; /* CAN Controller Status */
> + u8 ecc; /* Error Capture Register */
> + u8 rec; /* RX Error Counter */
> + u8 tec; /* TX Error Counter */
> + } ev_can_err_ext; /* For ESD_EV_CAN_ERROR_EXT */
> + };
> };
>
> struct tx_msg {
> @@ -229,10 +237,10 @@ static void esd_usb_rx_event(struct esd_usb_net_priv *priv,
> u32 id = le32_to_cpu(msg->msg.rx.id) & ESD_IDMASK;
>
> if (id == ESD_EV_CAN_ERROR_EXT) {
> - u8 state = msg->msg.rx.data[0];
> - u8 ecc = msg->msg.rx.data[1];
> - u8 rxerr = msg->msg.rx.data[2];
> - u8 txerr = msg->msg.rx.data[3];
> + u8 state = msg->msg.rx.ev_can_err_ext.status;
> + u8 ecc = msg->msg.rx.ev_can_err_ext.ecc;
> + u8 rxerr = msg->msg.rx.ev_can_err_ext.rec;
> + u8 txerr = msg->msg.rx.ev_can_err_ext.tec;
I do not like how you have to write msg->msg.rx.something. I think it
would be better to make the union within struct esd_usb_msg anonymous:
https://elixir.bootlin.com/linux/latest/source/drivers/net/can/usb/esd_usb.c#L169
That said, this is not a criticism of this patch but more something to
be addressed in a separate clean-up patch.
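To show the difference in access paths, here is a minimal sketch with
simplified, hypothetical structures (demo_rx, demo_tx and the two
demo_msg types are not the driver's real layout). It relies on C11
anonymous unions.

```c
#include <stdint.h>

/* Simplified stand-ins for the per-direction message layouts. */
struct demo_rx {
	uint8_t dlc;
	uint8_t data[8];
};

struct demo_tx {
	uint8_t dlc;
	uint8_t data[8];
};

/* Named union: members are reached through the extra ".msg" step. */
struct demo_msg_named {
	union {
		struct demo_rx rx;
		struct demo_tx tx;
	} msg;
};

/* Anonymous union: the same members are reached directly. */
struct demo_msg_anon {
	union {
		struct demo_rx rx;
		struct demo_tx tx;
	};
};

uint8_t demo_named_dlc(const struct demo_msg_named *m)
{
	return m->msg.rx.dlc;	/* msg->msg.rx.dlc in the driver's spelling */
}

uint8_t demo_anon_dlc(const struct demo_msg_anon *m)
{
	return m->rx.dlc;	/* one level shorter, same storage */
}
```

The layout does not change; only the spelling at each use site gets
shorter.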
> netdev_dbg(priv->netdev,
> "CAN_ERR_EV_EXT: dlc=%#02x state=%02x ecc=%02x rec=%02x tec=%02x\n",
> --
> 2.25.1
>
^ permalink raw reply
* Re: [PATCH 1/3] can: esd_usb: Improved behavior on esd CAN_ERROR_EXT event (1)
From: Vincent MAILHOL @ 2022-12-20 5:16 UTC (permalink / raw)
To: Frank Jungclaus
Cc: linux-can, Marc Kleine-Budde, Wolfgang Grandegger,
Stefan Mätje, netdev, linux-kernel
In-Reply-To: <20221219212013.1294820-2-frank.jungclaus@esd.eu>
On Tue. 20 Dec. 2022 at 06:25, Frank Jungclaus <frank.jungclaus@esd.eu> wrote:
>
> Moved the supply for cf->data[3] (bit stream position of CAN error)
> outside of the "switch (ecc & SJA1000_ECC_MASK){}"-statement, because
> this position is independent of the error type.
>
> Fixes: 96d8e90382dc ("can: Add driver for esd CAN-USB/2 device")
> Signed-off-by: Frank Jungclaus <frank.jungclaus@esd.eu>
> ---
> drivers/net/can/usb/esd_usb.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/can/usb/esd_usb.c b/drivers/net/can/usb/esd_usb.c
> index 42323f5e6f3a..5e182fadd875 100644
> --- a/drivers/net/can/usb/esd_usb.c
> +++ b/drivers/net/can/usb/esd_usb.c
> @@ -286,7 +286,6 @@ static void esd_usb_rx_event(struct esd_usb_net_priv *priv,
> cf->data[2] |= CAN_ERR_PROT_STUFF;
> break;
> default:
> - cf->data[3] = ecc & SJA1000_ECC_SEG;
> break;
> }
>
> @@ -294,6 +293,9 @@ static void esd_usb_rx_event(struct esd_usb_net_priv *priv,
> if (!(ecc & SJA1000_ECC_DIR))
> cf->data[2] |= CAN_ERR_PROT_TX;
>
> + /* Bit stream position in CAN frame as the error was detected */
> + cf->data[3] = ecc & SJA1000_ECC_SEG;
Can you confirm that the value returned by the device matches the
specifications from linux/can/error.h?
https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/can/error.h#L90
> if (priv->can.state == CAN_STATE_ERROR_WARNING ||
> priv->can.state == CAN_STATE_ERROR_PASSIVE) {
> cf->data[1] = (txerr > rxerr) ?
> --
> 2.25.1
>
^ permalink raw reply
* [PATCH] sctp: Make sha1 as default algorithm if fips is enabled
From: Ashwin Dayanand Kamat @ 2022-12-20 5:10 UTC (permalink / raw)
To: Vlad Yasevich, Neil Horman, Marcelo Ricardo Leitner,
David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
linux-sctp, netdev, linux-kernel
Cc: Ashwin Dayanand Kamat, srivatsab, srivatsa, amakhalov,
vsirnapalli, akaher
MD5 is not FIPS compliant, but md5 was still used as the default algorithm
for sctp when fips was enabled.
Because of this, the listen() system call in the ltp tests failed for sctp
in a fips environment with the error message below.
[ 6397.892677] sctp: failed to load transform for md5: -2
The fix is to not assign md5 as the default algorithm for sctp
if fips_enabled is true, and to make sha1 the default algorithm instead.
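The selection logic in the diff below can be restated as a small pure
function, which makes the outcomes easy to enumerate. This is a
userspace sketch: demo_pick_hmac_alg and its boolean parameters are
hypothetical stand-ins for fips_enabled and the two IS_ENABLED() checks.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Mirror of the if/else chain in the patch: md5 only when FIPS is off and
 * the MD5 default is configured, else sha1 when the SHA1 default is
 * configured, else no default algorithm.
 */
const char *demo_pick_hmac_alg(bool fips, bool cfg_md5, bool cfg_sha1)
{
	if (!fips && cfg_md5)
		return "md5";
	else if (cfg_sha1)
		return "sha1";
	else
		return NULL;
}
```

Note that with this chain, a build that configures only the MD5 default
ends up with NULL when FIPS is enabled, not with sha1.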
Signed-off-by: Ashwin Dayanand Kamat <kashwindayan@vmware.com>
---
net/sctp/protocol.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index 909a89a..b6e9810 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -34,6 +34,7 @@
#include <linux/memblock.h>
#include <linux/highmem.h>
#include <linux/slab.h>
+#include <linux/fips.h>
#include <net/net_namespace.h>
#include <net/protocol.h>
#include <net/ip.h>
@@ -1321,14 +1322,13 @@ static int __net_init sctp_defaults_init(struct net *net)
/* Whether Cookie Preservative is enabled(1) or not(0) */
net->sctp.cookie_preserve_enable = 1;
- /* Default sctp sockets to use md5 as their hmac alg */
-#if defined (CONFIG_SCTP_DEFAULT_COOKIE_HMAC_MD5)
- net->sctp.sctp_hmac_alg = "md5";
-#elif defined (CONFIG_SCTP_DEFAULT_COOKIE_HMAC_SHA1)
- net->sctp.sctp_hmac_alg = "sha1";
-#else
- net->sctp.sctp_hmac_alg = NULL;
-#endif
+ /* Default sctp sockets to use md5 as default only if fips is not enabled */
+ if (!fips_enabled && IS_ENABLED(CONFIG_SCTP_DEFAULT_COOKIE_HMAC_MD5))
+ net->sctp.sctp_hmac_alg = "md5";
+ else if (IS_ENABLED(CONFIG_SCTP_DEFAULT_COOKIE_HMAC_SHA1))
+ net->sctp.sctp_hmac_alg = "sha1";
+ else
+ net->sctp.sctp_hmac_alg = NULL;
/* Max.Burst - 4 */
net->sctp.max_burst = SCTP_DEFAULT_MAX_BURST;
--
2.7.4
^ permalink raw reply related