All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf 0/3] sockmap fixes picked up by stress tests
@ 2021-07-19 21:48 John Fastabend
  2021-07-19 21:48 ` [PATCH bpf 1/3] bpf, sockmap: zap ingress queues after stopping strparser John Fastabend
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: John Fastabend @ 2021-07-19 21:48 UTC (permalink / raw)
  To: jakub, daniel, xiyou.wangcong, alexei.starovoitov
  Cc: john.fastabend, bpf, netdev

Running stress tests with recent patch to remove an extra lock in sockmap
resulted in a couple new issues popping up. It seems only one of them
is actually related to the patch:

799aa7f98d53 ("skmsg: Avoid lock_sock() in sk_psock_backlog()")

The other two issues had existed long before, but I guess the timing
with the serialization we had before was too tight to get any of
our tests or deployments to hit it.

With attached series stress testing sockmap+TCP with workloads that
create lots of short-lived connections no more splats like below were
seen on upstream bpf branch.

[224913.935822] WARNING: CPU: 3 PID: 32100 at net/core/stream.c:208 sk_stream_kill_queues+0x212/0x220
[224913.935841] Modules linked in: fuse overlay bpf_preload x86_pkg_temp_thermal intel_uncore wmi_bmof squashfs sch_fq_codel efivarfs ip_tables x_tables uas xhci_pci ixgbe mdio xfrm_algo xhci_hcd wmi
[224913.935897] CPU: 3 PID: 32100 Comm: fgs-bench Tainted: G          I       5.14.0-rc1alu+ #181
[224913.935908] Hardware name: Dell Inc. Precision 5820 Tower/002KVM, BIOS 1.9.2 01/24/2019
[224913.935914] RIP: 0010:sk_stream_kill_queues+0x212/0x220
[224913.935923] Code: 8b 83 20 02 00 00 85 c0 75 20 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 89 df e8 2b 11 fe ff eb c3 0f 0b e9 7c ff ff ff 0f 0b eb ce <0f> 0b 5b 5d 41 5c 41 5d 41 5e 41 5f c3 90 0f 1f 44 00 00 41 57 41
[224913.935932] RSP: 0018:ffff88816271fd38 EFLAGS: 00010206
[224913.935941] RAX: 0000000000000ae8 RBX: ffff88815acd5240 RCX: dffffc0000000000
[224913.935948] RDX: 0000000000000003 RSI: 0000000000000ae8 RDI: ffff88815acd5460
[224913.935954] RBP: ffff88815acd5460 R08: ffffffff955c0ae8 R09: fffffbfff2e6f543
[224913.935961] R10: ffffffff9737aa17 R11: fffffbfff2e6f542 R12: ffff88815acd5390
[224913.935967] R13: ffff88815acd5480 R14: ffffffff98d0c080 R15: ffffffff96267500
[224913.935974] FS:  00007f86e6bd1700(0000) GS:ffff888451cc0000(0000) knlGS:0000000000000000
[224913.935981] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[224913.935988] CR2: 000000c0008eb000 CR3: 00000001020e0005 CR4: 00000000003706e0
[224913.935994] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[224913.936000] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[224913.936007] Call Trace:
[224913.936016]  inet_csk_destroy_sock+0xba/0x1f0
[224913.936033]  __tcp_close+0x620/0x790
[224913.936047]  tcp_close+0x20/0x80
[224913.936056]  inet_release+0x8f/0xf0
[224913.936070]  __sock_release+0x72/0x120

John Fastabend (3):
  bpf, sockmap: zap ingress queues after stopping strparser
  bpf, sockmap: on cleanup we additionally need to remove cached skb
  bpf, sockmap: fix memleak on ingress msg enqueue

 include/linux/skmsg.h | 54 ++++++++++++++++++++++++++++---------------
 net/core/skmsg.c      | 37 +++++++++++++++++++++--------
 2 files changed, 62 insertions(+), 29 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [PATCH bpf 2/3] bpf, sockmap: on cleanup we additionally need to remove cached skb
  2021-07-19 21:48 ` [PATCH bpf 2/3] bpf, sockmap: on cleanup we additionally need to remove cached skb John Fastabend
  2021-07-21 10:13   ` Jakub Sitnicki
@ 2021-07-21  9:38 ` Dan Carpenter
  0 siblings, 0 replies; 13+ messages in thread
From: kernel test robot @ 2021-07-20 10:53 UTC (permalink / raw)
  To: kbuild

[-- Attachment #1: Type: text/plain, Size: 5474 bytes --]

CC: kbuild-all(a)lists.01.org
In-Reply-To: <20210719214834.125484-3-john.fastabend@gmail.com>
References: <20210719214834.125484-3-john.fastabend@gmail.com>
TO: John Fastabend <john.fastabend@gmail.com>
TO: jakub(a)cloudflare.com
TO: daniel(a)iogearbox.net
TO: xiyou.wangcong(a)gmail.com
TO: alexei.starovoitov(a)gmail.com
CC: john.fastabend(a)gmail.com
CC: bpf(a)vger.kernel.org
CC: netdev(a)vger.kernel.org

Hi John,

I love your patch! Perhaps something to improve:

[auto build test WARNING on bpf/master]

url:    https://github.com/0day-ci/linux/commits/John-Fastabend/sockmap-fixes-picked-up-by-stress-tests/20210720-144138
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git master
:::::: branch date: 4 hours ago
:::::: commit date: 4 hours ago
config: i386-randconfig-m021-20210720 (attached as .config)
compiler: gcc-10 (Ubuntu 10.3.0-1ubuntu1~20.04) 10.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

smatch warnings:
net/core/skmsg.c:627 sk_psock_backlog() error: uninitialized symbol 'skb'.
net/core/skmsg.c:639 sk_psock_backlog() error: uninitialized symbol 'off'.
net/core/skmsg.c:640 sk_psock_backlog() error: uninitialized symbol 'len'.

vim +/skb +627 net/core/skmsg.c

d1f6b1c794e27f John Fastabend  2021-07-19  608  
604326b41a6fb9 Daniel Borkmann 2018-10-13  609  static void sk_psock_backlog(struct work_struct *work)
604326b41a6fb9 Daniel Borkmann 2018-10-13  610  {
604326b41a6fb9 Daniel Borkmann 2018-10-13  611  	struct sk_psock *psock = container_of(work, struct sk_psock, work);
604326b41a6fb9 Daniel Borkmann 2018-10-13  612  	struct sk_psock_work_state *state = &psock->work_state;
604326b41a6fb9 Daniel Borkmann 2018-10-13  613  	struct sk_buff *skb;
604326b41a6fb9 Daniel Borkmann 2018-10-13  614  	bool ingress;
604326b41a6fb9 Daniel Borkmann 2018-10-13  615  	u32 len, off;
604326b41a6fb9 Daniel Borkmann 2018-10-13  616  	int ret;
604326b41a6fb9 Daniel Borkmann 2018-10-13  617  
799aa7f98d53e0 Cong Wang       2021-03-30  618  	mutex_lock(&psock->work_mutex);
d1f6b1c794e27f John Fastabend  2021-07-19  619  	if (unlikely(state->skb)) {
d1f6b1c794e27f John Fastabend  2021-07-19  620  		spin_lock_bh(&psock->ingress_lock);
604326b41a6fb9 Daniel Borkmann 2018-10-13  621  		skb = state->skb;
604326b41a6fb9 Daniel Borkmann 2018-10-13  622  		len = state->len;
604326b41a6fb9 Daniel Borkmann 2018-10-13  623  		off = state->off;
604326b41a6fb9 Daniel Borkmann 2018-10-13  624  		state->skb = NULL;
d1f6b1c794e27f John Fastabend  2021-07-19  625  		spin_unlock_bh(&psock->ingress_lock);
604326b41a6fb9 Daniel Borkmann 2018-10-13  626  	}
d1f6b1c794e27f John Fastabend  2021-07-19 @627  	if (skb)
d1f6b1c794e27f John Fastabend  2021-07-19  628  		goto start;
604326b41a6fb9 Daniel Borkmann 2018-10-13  629  
604326b41a6fb9 Daniel Borkmann 2018-10-13  630  	while ((skb = skb_dequeue(&psock->ingress_skb))) {
604326b41a6fb9 Daniel Borkmann 2018-10-13  631  		len = skb->len;
604326b41a6fb9 Daniel Borkmann 2018-10-13  632  		off = 0;
604326b41a6fb9 Daniel Borkmann 2018-10-13  633  start:
e3526bb92a2084 Cong Wang       2021-02-23  634  		ingress = skb_bpf_ingress(skb);
e3526bb92a2084 Cong Wang       2021-02-23  635  		skb_bpf_redirect_clear(skb);
604326b41a6fb9 Daniel Borkmann 2018-10-13  636  		do {
604326b41a6fb9 Daniel Borkmann 2018-10-13  637  			ret = -EIO;
799aa7f98d53e0 Cong Wang       2021-03-30  638  			if (!sock_flag(psock->sk, SOCK_DEAD))
604326b41a6fb9 Daniel Borkmann 2018-10-13 @639  				ret = sk_psock_handle_skb(psock, skb, off,
604326b41a6fb9 Daniel Borkmann 2018-10-13 @640  							  len, ingress);
604326b41a6fb9 Daniel Borkmann 2018-10-13  641  			if (ret <= 0) {
604326b41a6fb9 Daniel Borkmann 2018-10-13  642  				if (ret == -EAGAIN) {
d1f6b1c794e27f John Fastabend  2021-07-19  643  					sk_psock_skb_state(psock, state, skb,
d1f6b1c794e27f John Fastabend  2021-07-19  644  							   len, off);
604326b41a6fb9 Daniel Borkmann 2018-10-13  645  					goto end;
604326b41a6fb9 Daniel Borkmann 2018-10-13  646  				}
604326b41a6fb9 Daniel Borkmann 2018-10-13  647  				/* Hard errors break pipe and stop xmit. */
604326b41a6fb9 Daniel Borkmann 2018-10-13  648  				sk_psock_report_error(psock, ret ? -ret : EPIPE);
604326b41a6fb9 Daniel Borkmann 2018-10-13  649  				sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED);
781dd0431eb549 Cong Wang       2021-06-14  650  				sock_drop(psock->sk, skb);
604326b41a6fb9 Daniel Borkmann 2018-10-13  651  				goto end;
604326b41a6fb9 Daniel Borkmann 2018-10-13  652  			}
604326b41a6fb9 Daniel Borkmann 2018-10-13  653  			off += ret;
604326b41a6fb9 Daniel Borkmann 2018-10-13  654  			len -= ret;
604326b41a6fb9 Daniel Borkmann 2018-10-13  655  		} while (len);
604326b41a6fb9 Daniel Borkmann 2018-10-13  656  
604326b41a6fb9 Daniel Borkmann 2018-10-13  657  		if (!ingress)
604326b41a6fb9 Daniel Borkmann 2018-10-13  658  			kfree_skb(skb);
604326b41a6fb9 Daniel Borkmann 2018-10-13  659  	}
604326b41a6fb9 Daniel Borkmann 2018-10-13  660  end:
799aa7f98d53e0 Cong Wang       2021-03-30  661  	mutex_unlock(&psock->work_mutex);
604326b41a6fb9 Daniel Borkmann 2018-10-13  662  }
604326b41a6fb9 Daniel Borkmann 2018-10-13  663  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 39368 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-07-21 18:02 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-07-19 21:48 [PATCH bpf 0/3] sockmap fixes picked up by stress tests John Fastabend
2021-07-19 21:48 ` [PATCH bpf 1/3] bpf, sockmap: zap ingress queues after stopping strparser John Fastabend
2021-07-20 10:27   ` Jakub Sitnicki
2021-07-20 17:41     ` John Fastabend
2021-07-19 21:48 ` [PATCH bpf 2/3] bpf, sockmap: on cleanup we additionally need to remove cached skb John Fastabend
2021-07-21 10:13   ` Jakub Sitnicki
2021-07-19 21:48 ` [PATCH bpf 3/3] bpf, sockmap: fix memleak on ingress msg enqueue John Fastabend
2021-07-21 16:36   ` Jakub Sitnicki
2021-07-21 18:02   ` Martin KaFai Lau
2021-07-21 16:38 ` [PATCH bpf 0/3] sockmap fixes picked up by stress tests Jakub Sitnicki
  -- strict thread matches above, loose matches on Subject: below --
2021-07-20 10:53 [PATCH bpf 2/3] bpf, sockmap: on cleanup we additionally need to remove cached skb kernel test robot
2021-07-21  9:38 ` Dan Carpenter
2021-07-21  9:38 ` Dan Carpenter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.