From: Kuniyuki Iwashima <kuniyu@google.com>
To: Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Martin KaFai Lau <martin.lau@linux.dev>
Cc: John Fastabend <john.fastabend@gmail.com>,
Stanislav Fomichev <sdf@fomichev.me>,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@kernel.org>,
Roman Gushchin <roman.gushchin@linux.dev>,
Shakeel Butt <shakeel.butt@linux.dev>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>,
Paolo Abeni <pabeni@redhat.com>,
Neal Cardwell <ncardwell@google.com>,
Willem de Bruijn <willemb@google.com>,
Mina Almasry <almasrymina@google.com>,
Kuniyuki Iwashima <kuniyu@google.com>,
Kuniyuki Iwashima <kuni1840@gmail.com>,
bpf@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH v2 bpf-next/net 1/8] tcp: Save lock_sock() for memcg in inet_csk_accept().
Date: Mon, 25 Aug 2025 20:41:24 +0000 [thread overview]
Message-ID: <20250825204158.2414402-2-kuniyu@google.com> (raw)
In-Reply-To: <20250825204158.2414402-1-kuniyu@google.com>
If memcg is enabled, accept() acquires lock_sock() twice for each new
TCP/MPTCP socket in inet_csk_accept() and __inet_accept().
Let's move memcg operations from inet_csk_accept() to __inet_accept().
This makes easier to add a BPF hook that covers sk_prot.memory_allocated
users (TCP, MPTCP, SCTP) in a single place.
Two notes:
1)
SCTP somehow allocates a new socket by sk_alloc() in sk->sk_prot->accept()
and clones fields manually, instead of using sk_clone_lock().
For SCTP, mem_cgroup_sk_alloc() has been called before __inet_accept(),
so I added the protocol tests in __inet_accept(), but this can be removed
once SCTP uses sk_clone_lock().
2)
The single if block is separated into two because we will add a new bpf
hook between the blocks, where a bpf prog can add a flag in sk->sk_memcg.
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
net/ipv4/af_inet.c | 22 ++++++++++++++++++++++
net/ipv4/inet_connection_sock.c | 25 -------------------------
2 files changed, 22 insertions(+), 25 deletions(-)
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 76e38092cd8a..ae83ecda3983 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -753,6 +753,28 @@ EXPORT_SYMBOL(inet_stream_connect);
void __inet_accept(struct socket *sock, struct socket *newsock, struct sock *newsk)
{
+ gfp_t gfp = GFP_KERNEL | __GFP_NOFAIL;
+
+ /* TODO: use sk_clone_lock() in SCTP and remove protocol checks */
+ if (mem_cgroup_sockets_enabled &&
+ (!IS_ENABLED(CONFIG_IP_SCTP) ||
+ sk_is_tcp(newsk) || sk_is_mptcp(newsk))) {
+ mem_cgroup_sk_alloc(newsk);
+ kmem_cache_charge(newsk, gfp);
+ }
+
+ if (mem_cgroup_sk_enabled(newsk)) {
+ int amt;
+
+ /* The socket has not been accepted yet, no need
+ * to look at newsk->sk_wmem_queued.
+ */
+ amt = sk_mem_pages(newsk->sk_forward_alloc +
+ atomic_read(&newsk->sk_rmem_alloc));
+ if (amt)
+ mem_cgroup_sk_charge(newsk, amt, gfp);
+ }
+
sock_rps_record_flow(newsk);
WARN_ON(!((1 << newsk->sk_state) &
(TCPF_ESTABLISHED | TCPF_SYN_RECV |
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 0ef1eacd539d..ed10b959a906 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -708,31 +708,6 @@ struct sock *inet_csk_accept(struct sock *sk, struct proto_accept_arg *arg)
release_sock(sk);
- if (mem_cgroup_sockets_enabled) {
- gfp_t gfp = GFP_KERNEL | __GFP_NOFAIL;
- int amt = 0;
-
- /* atomically get the memory usage, set and charge the
- * newsk->sk_memcg.
- */
- lock_sock(newsk);
-
- mem_cgroup_sk_alloc(newsk);
- if (mem_cgroup_from_sk(newsk)) {
- /* The socket has not been accepted yet, no need
- * to look at newsk->sk_wmem_queued.
- */
- amt = sk_mem_pages(newsk->sk_forward_alloc +
- atomic_read(&newsk->sk_rmem_alloc));
- }
-
- if (amt)
- mem_cgroup_sk_charge(newsk, amt, gfp);
- kmem_cache_charge(newsk, gfp);
-
- release_sock(newsk);
- }
-
if (req)
reqsk_put(req);
--
2.51.0.261.g7ce5a0a67e-goog
next prev parent reply other threads:[~2025-08-25 20:42 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-25 20:41 [PATCH v2 bpf-next/net 0/8] bpf: Allow decoupling memcg from sk->sk_prot->memory_allocated Kuniyuki Iwashima
2025-08-25 20:41 ` Kuniyuki Iwashima [this message]
2025-08-25 20:41 ` [PATCH v2 bpf-next/net 2/8] bpf: Add a bpf hook in __inet_accept() Kuniyuki Iwashima
2025-08-25 20:51 ` Stanislav Fomichev
2025-08-25 21:07 ` Kuniyuki Iwashima
2025-08-25 20:41 ` [PATCH v2 bpf-next/net 3/8] libbpf: Support BPF_CGROUP_INET_SOCK_ACCEPT Kuniyuki Iwashima
2025-08-25 20:41 ` [PATCH v2 bpf-next/net 4/8] bpftool: " Kuniyuki Iwashima
2025-08-25 20:41 ` [PATCH v2 bpf-next/net 5/8] bpf: Support bpf_setsockopt() for BPF_CGROUP_INET_SOCK_(CREATE|ACCEPT) Kuniyuki Iwashima
2025-08-25 20:41 ` [PATCH v2 bpf-next/net 6/8] bpf: Introduce SK_BPF_MEMCG_FLAGS and SK_BPF_MEMCG_SOCK_ISOLATED Kuniyuki Iwashima
2025-08-25 20:41 ` [PATCH v2 bpf-next/net 7/8] net-memcg: Allow decoupling memcg from global protocol memory accounting Kuniyuki Iwashima
2025-08-25 20:41 ` [PATCH v2 bpf-next/net 8/8] selftest: bpf: Add test for SK_BPF_MEMCG_SOCK_ISOLATED Kuniyuki Iwashima
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250825204158.2414402-2-kuniyu@google.com \
--to=kuniyu@google.com \
--cc=almasrymina@google.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=hannes@cmpxchg.org \
--cc=john.fastabend@gmail.com \
--cc=kuba@kernel.org \
--cc=kuni1840@gmail.com \
--cc=martin.lau@linux.dev \
--cc=mhocko@kernel.org \
--cc=ncardwell@google.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=roman.gushchin@linux.dev \
--cc=sdf@fomichev.me \
--cc=shakeel.butt@linux.dev \
--cc=willemb@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.