From: Cong Wang <xiyou.wangcong@gmail.com>
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com,
wangdongdong.6@bytedance.com, jiang.wang@bytedance.com,
Cong Wang <cong.wang@bytedance.com>,
John Fastabend <john.fastabend@gmail.com>,
Daniel Borkmann <daniel@iogearbox.net>,
Jakub Sitnicki <jakub@cloudflare.com>,
Lorenz Bauer <lmb@cloudflare.com>
Subject: [Patch bpf-next v4 04/11] skmsg: avoid lock_sock() in sk_psock_backlog()
Date: Tue, 9 Mar 2021 21:32:15 -0800
Message-ID: <20210310053222.41371-5-xiyou.wangcong@gmail.com>
In-Reply-To: <20210310053222.41371-1-xiyou.wangcong@gmail.com>
From: Cong Wang <cong.wang@bytedance.com>
We do not have to lock the sock to avoid losing sk_socket. Instead,
we can purge all the ingress queues when we close the socket.
Sending or receiving packets after orphaning the socket makes no
sense.

We already purge these queues when the psock refcnt reaches 0, but
here we want to purge them explicitly in sock_map_close().
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jakub Sitnicki <jakub@cloudflare.com>
Cc: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
---
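The approach described above can be modeled outside the kernel. The
following is a minimal single-threaded userspace sketch, not kernel
code: struct model_psock and the model_*() helpers are hypothetical
names invented for illustration, the fixed-size array stands in for the
ingress queue, and cancel_work_sync() has no counterpart because nothing
runs concurrently here. It only shows the ordering idea: the backlog
worker checks a dead flag instead of taking the socket lock, and close
purges whatever is still queued before the socket is orphaned.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_QUEUE_CAP 8

/* Hypothetical userspace stand-in for struct sk_psock (illustration only). */
struct model_psock {
	bool sock_dead;			/* models sock_flag(sk, SOCK_DEAD) */
	bool tx_enabled;		/* models SK_PSOCK_TX_ENABLED */
	int queue[MODEL_QUEUE_CAP];	/* models the ingress queue */
	size_t head, tail;		/* pending entries are [head, tail) */
	int delivered;			/* entries handed to the socket */
	int last_error;			/* models sk_psock_report_error() */
};

/* Queue one entry; returns false when the model queue is full. */
bool model_enqueue(struct model_psock *p, int pkt)
{
	if (p->tail == MODEL_QUEUE_CAP)
		return false;
	p->queue[p->tail++] = pkt;
	return true;
}

/* Models the worker: no socket lock, only a liveness check per entry. */
void model_backlog(struct model_psock *p)
{
	while (p->tx_enabled && p->head < p->tail) {
		if (p->sock_dead) {
			/* Hard error: report it, stop xmit, drop the entry. */
			p->last_error = EIO;
			p->tx_enabled = false;
			p->head++;
			return;
		}
		p->head++;
		p->delivered++;
	}
}

/* Models the purge step: stop transmission, then drop pending entries. */
size_t model_purge(struct model_psock *p)
{
	size_t dropped = p->tail - p->head;

	p->tx_enabled = false;
	/* The kernel patch calls cancel_work_sync() here; this model has no worker thread. */
	p->head = 0;
	p->tail = 0;
	assert(p->head == p->tail);
	return dropped;
}

/* Models close: purge first, then mark the socket dead (orphaned). */
size_t model_close(struct model_psock *p)
{
	size_t dropped = model_purge(p);

	p->sock_dead = true;
	return dropped;
}
```

For example, enqueueing two entries and calling model_backlog() delivers
both; enqueueing a third and then calling model_close() drops it, so a
later model_backlog() call delivers nothing further and reports no error.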
 include/linux/skmsg.h |  1 +
 net/core/skmsg.c      | 22 ++++++++++++++--------
 net/core/sock_map.c   |  1 +
 3 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index 7333bf881b81..91b357817bb8 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -347,6 +347,7 @@ static inline void sk_psock_report_error(struct sk_psock *psock, int err)
 }
 
 struct sk_psock *sk_psock_init(struct sock *sk, int node);
+void sk_psock_purge(struct sk_psock *psock);
 
 #if IS_ENABLED(CONFIG_BPF_STREAM_PARSER)
 int sk_psock_init_strp(struct sock *sk, struct sk_psock *psock);
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 41a5f82c53e6..bf0f874780c1 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -497,7 +497,7 @@ static int sk_psock_handle_skb(struct sk_psock *psock, struct sk_buff *skb,
 	if (!ingress) {
 		if (!sock_writeable(psock->sk))
 			return -EAGAIN;
-		return skb_send_sock_locked(psock->sk, skb, off, len);
+		return skb_send_sock(psock->sk, skb, off, len);
 	}
 	return sk_psock_skb_ingress(psock, skb);
 }
@@ -511,8 +511,6 @@ static void sk_psock_backlog(struct work_struct *work)
 	u32 len, off;
 	int ret;
 
-	/* Lock sock to avoid losing sk_socket during loop. */
-	lock_sock(psock->sk);
 	if (state->skb) {
 		skb = state->skb;
 		len = state->len;
@@ -529,7 +527,7 @@ static void sk_psock_backlog(struct work_struct *work)
 		skb_bpf_redirect_clear(skb);
 		do {
 			ret = -EIO;
-			if (likely(psock->sk->sk_socket))
+			if (!sock_flag(psock->sk, SOCK_DEAD))
 				ret = sk_psock_handle_skb(psock, skb, off,
 							  len, ingress);
 			if (ret <= 0) {
@@ -537,13 +535,13 @@ static void sk_psock_backlog(struct work_struct *work)
 					state->skb = skb;
 					state->len = len;
 					state->off = off;
-					goto end;
+					return;
 				}
 				/* Hard errors break pipe and stop xmit. */
 				sk_psock_report_error(psock, ret ? -ret : EPIPE);
 				sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED);
 				kfree_skb(skb);
-				goto end;
+				return;
 			}
 			off += ret;
 			len -= ret;
@@ -552,8 +550,6 @@ static void sk_psock_backlog(struct work_struct *work)
 		if (!ingress)
 			kfree_skb(skb);
 	}
-end:
-	release_sock(psock->sk);
 }
 
 struct sk_psock *sk_psock_init(struct sock *sk, int node)
@@ -654,6 +650,16 @@ static void sk_psock_link_destroy(struct sk_psock *psock)
 	}
 }
 
+void sk_psock_purge(struct sk_psock *psock)
+{
+	sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED);
+
+	cancel_work_sync(&psock->work);
+
+	sk_psock_cork_free(psock);
+	sk_psock_zap_ingress(psock);
+}
+
 static void sk_psock_done_strp(struct sk_psock *psock);
 
 static void sk_psock_destroy_deferred(struct work_struct *gc)
diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index dd53a7771d7e..26ba47b099f1 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -1540,6 +1540,7 @@ void sock_map_close(struct sock *sk, long timeout)
 	saved_close = psock->saved_close;
 	sock_map_remove_links(sk, psock);
 	rcu_read_unlock();
+	sk_psock_purge(psock);
 	release_sock(sk);
 	saved_close(sk, timeout);
 }
--
2.25.1
Thread overview: 20+ messages
2021-03-10 5:32 [Patch bpf-next v4 00/11] sockmap: introduce BPF_SK_SKB_VERDICT and support UDP Cong Wang
2021-03-10 5:32 ` [Patch bpf-next v4 01/11] skmsg: lock ingress_skb when purging Cong Wang
2021-03-11 10:52 ` Jakub Sitnicki
2021-03-10 5:32 ` [Patch bpf-next v4 02/11] skmsg: introduce a spinlock to protect ingress_msg Cong Wang
2021-03-11 11:28 ` Jakub Sitnicki
2021-03-12 0:45 ` Cong Wang
2021-03-10 5:32 ` [Patch bpf-next v4 03/11] skmsg: introduce skb_send_sock() for sock_map Cong Wang
2021-03-11 11:42 ` Jakub Sitnicki
2021-03-12 0:47 ` Cong Wang
2021-03-10 5:32 ` Cong Wang [this message]
2021-03-12 12:02 ` [Patch bpf-next v4 04/11] skmsg: avoid lock_sock() in sk_psock_backlog() Jakub Sitnicki
2021-03-13 17:32 ` Cong Wang
2021-03-15 20:55 ` Jakub Sitnicki
2021-03-10 5:32 ` [Patch bpf-next v4 05/11] sock_map: introduce BPF_SK_SKB_VERDICT Cong Wang
2021-03-10 5:32 ` [Patch bpf-next v4 06/11] sock: introduce sk->sk_prot->psock_update_sk_prot() Cong Wang
2021-03-10 5:32 ` [Patch bpf-next v4 07/11] udp: implement ->read_sock() for sockmap Cong Wang
2021-03-10 5:32 ` [Patch bpf-next v4 08/11] skmsg: extract __tcp_bpf_recvmsg() and tcp_bpf_wait_data() Cong Wang
2021-03-10 5:32 ` [Patch bpf-next v4 09/11] udp: implement udp_bpf_recvmsg() for sockmap Cong Wang
2021-03-10 5:32 ` [Patch bpf-next v4 10/11] sock_map: update sock type checks for UDP Cong Wang
2021-03-10 5:32 ` [Patch bpf-next v4 11/11] selftests/bpf: add a test case for udp sockmap Cong Wang