* [PATCH ipsec 0/2] fix some leaks in espintcp
@ 2025-04-09 13:59 Sabrina Dubroca
2025-04-09 13:59 ` [PATCH ipsec 1/2] espintcp: fix skb leaks Sabrina Dubroca
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Sabrina Dubroca @ 2025-04-09 13:59 UTC (permalink / raw)
To: netdev; +Cc: Sabrina Dubroca, Herbert Xu, Steffen Klassert
kmemleak spotted a few leaks that have been here since the beginning.
Sabrina Dubroca (2):
  espintcp: fix skb leaks
  espintcp: remove encap socket caching to avoid reference leak

 include/net/xfrm.h    |  1 -
 net/ipv4/esp4.c       | 53 ++++++-------------------------------------
 net/ipv6/esp6.c       | 53 ++++++-------------------------------------
 net/xfrm/espintcp.c   |  4 +++-
 net/xfrm/xfrm_state.c |  3 ---
 5 files changed, 17 insertions(+), 97 deletions(-)
--
2.49.0
* [PATCH ipsec 1/2] espintcp: fix skb leaks
2025-04-09 13:59 [PATCH ipsec 0/2] fix some leaks in espintcp Sabrina Dubroca
@ 2025-04-09 13:59 ` Sabrina Dubroca
2025-04-11 13:58 ` Simon Horman
2025-04-09 13:59 ` [PATCH ipsec 2/2] espintcp: remove encap socket caching to avoid reference leak Sabrina Dubroca
2025-04-14 10:00 ` [PATCH ipsec 0/2] fix some leaks in espintcp Steffen Klassert
2 siblings, 1 reply; 6+ messages in thread
From: Sabrina Dubroca @ 2025-04-09 13:59 UTC (permalink / raw)
To: netdev; +Cc: Sabrina Dubroca, Herbert Xu, Steffen Klassert
A few error paths are missing a kfree_skb.
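As a rough userspace illustration of the rule being enforced here (a function that takes ownership of a buffer has to free it on its error paths too), the sketch below models the espintcp_queue_out() case. `struct buf`, `buf_alloc()`, `buf_free()`, `struct queue`, `queue_out()`, `queue_purge()` and the `live_bufs` counter are hypothetical stand-ins invented for this example, not kernel APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sk_buff. */
struct buf {
	struct buf *next;
};

/* Number of buffers allocated and not yet freed (a crude leak detector). */
static int live_bufs;

static struct buf *buf_alloc(void)
{
	struct buf *b = calloc(1, sizeof(*b));

	if (b)
		live_bufs++;
	return b;
}

static void buf_free(struct buf *b)
{
	if (!b)
		return;
	live_bufs--;
	free(b);
}

/* Hypothetical stand-in for the out_queue and its backlog limit. */
struct queue {
	struct buf *head, *tail;
	size_t len, max;
};

/*
 * Takes ownership of @b on every path: on success the queue owns it,
 * on failure it is freed here so the caller has nothing left to clean up.
 */
static int queue_out(struct queue *q, struct buf *b)
{
	if (q->len >= q->max) {
		buf_free(b);	/* the call the error path was missing */
		return -ENOBUFS;
	}

	b->next = NULL;
	if (q->tail)
		q->tail->next = b;
	else
		q->head = b;
	q->tail = b;
	q->len++;
	return 0;
}

/* Free everything still sitting in the queue. */
static void queue_purge(struct queue *q)
{
	while (q->head) {
		struct buf *b = q->head;

		q->head = b->next;
		buf_free(b);
	}
	q->tail = NULL;
	q->len = 0;
}
```

With `max` set to 1, a second queue_out() returns -ENOBUFS and `live_bufs` stays at 1. Dropping the buf_free() call from the error path would leave it at 2, with no pointer left to the rejected buffer.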
Fixes: e27cca96cd68 ("xfrm: add espintcp (RFC 8229)")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
 net/ipv4/esp4.c     | 4 +++-
 net/ipv6/esp6.c     | 4 +++-
 net/xfrm/espintcp.c | 4 +++-
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index 0e4076866c0a..876df672c0bf 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -199,8 +199,10 @@ static int esp_output_tcp_finish(struct xfrm_state *x, struct sk_buff *skb)

 	sk = esp_find_tcp_sk(x);
 	err = PTR_ERR_OR_ZERO(sk);
-	if (err)
+	if (err) {
+		kfree_skb(skb);
 		goto out;
+	}

 	bh_lock_sock(sk);
 	if (sock_owned_by_user(sk))
diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c
index 9e73944e3b53..574989b82179 100644
--- a/net/ipv6/esp6.c
+++ b/net/ipv6/esp6.c
@@ -216,8 +216,10 @@ static int esp_output_tcp_finish(struct xfrm_state *x, struct sk_buff *skb)

 	sk = esp6_find_tcp_sk(x);
 	err = PTR_ERR_OR_ZERO(sk);
-	if (err)
+	if (err) {
+		kfree_skb(skb);
 		goto out;
+	}

 	bh_lock_sock(sk);
 	if (sock_owned_by_user(sk))
diff --git a/net/xfrm/espintcp.c b/net/xfrm/espintcp.c
index fe82e2d07300..fc7a603b04f1 100644
--- a/net/xfrm/espintcp.c
+++ b/net/xfrm/espintcp.c
@@ -171,8 +171,10 @@ int espintcp_queue_out(struct sock *sk, struct sk_buff *skb)
 	struct espintcp_ctx *ctx = espintcp_getctx(sk);

 	if (skb_queue_len(&ctx->out_queue) >=
-	    READ_ONCE(net_hotdata.max_backlog))
+	    READ_ONCE(net_hotdata.max_backlog)) {
+		kfree_skb(skb);
 		return -ENOBUFS;
+	}

 	__skb_queue_tail(&ctx->out_queue, skb);

--
2.49.0
* [PATCH ipsec 2/2] espintcp: remove encap socket caching to avoid reference leak
2025-04-09 13:59 [PATCH ipsec 0/2] fix some leaks in espintcp Sabrina Dubroca
2025-04-09 13:59 ` [PATCH ipsec 1/2] espintcp: fix skb leaks Sabrina Dubroca
@ 2025-04-09 13:59 ` Sabrina Dubroca
2025-04-11 13:59 ` Simon Horman
2025-04-14 10:00 ` [PATCH ipsec 0/2] fix some leaks in espintcp Steffen Klassert
2 siblings, 1 reply; 6+ messages in thread
From: Sabrina Dubroca @ 2025-04-09 13:59 UTC (permalink / raw)
To: netdev; +Cc: Sabrina Dubroca, Herbert Xu, Steffen Klassert
The current scheme for caching the encap socket can lead to reference
leaks when we try to delete the netns.
The reference chain is: xfrm_state -> encap_sk -> netns
Since the encap socket is a userspace socket, it holds a reference on
the netns. If we delete the espintcp state (through flush or
individual delete) before removing the netns, the reference on the
socket is dropped and the netns is correctly deleted. Otherwise, the
netns may not be reachable anymore (if all processes within the ns
have terminated), so we cannot delete the xfrm state to drop its
reference on the socket.
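The chain can be modelled in a few lines of userspace C. This is only a sketch under simplifying assumptions: the types and helpers (`netns_model`, `sock_model`, `state_model`, `sk_init()`, `sk_hold()`, `sk_put()`, `send_cached()`, `send_uncached()`, `state_delete()`) are made up for the illustration, the netns refcount counts socket references only, and there is no locking or RCU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy netns: refs counts only socket references in this model. */
struct netns_model {
	int refs;
	bool freed;
};

struct sock_model {
	int refs;
	struct netns_model *ns;
};

struct state_model {
	struct sock_model *cached_sk;	/* models the removed x->encap_sk */
};

static void netns_put(struct netns_model *ns)
{
	if (--ns->refs == 0)
		ns->freed = true;
}

/* A new socket starts with one reference (its userspace owner) and pins its netns. */
static void sk_init(struct sock_model *sk, struct netns_model *ns)
{
	sk->refs = 1;
	sk->ns = ns;
	ns->refs++;
}

static void sk_hold(struct sock_model *sk)
{
	sk->refs++;
}

/* Dropping the last socket reference releases the socket's netns reference. */
static void sk_put(struct sock_model *sk)
{
	if (--sk->refs == 0)
		netns_put(sk->ns);
}

/* Cached scheme: the first send stores a long-lived reference in the state. */
static void send_cached(struct state_model *x, struct sock_model *sk)
{
	if (!x->cached_sk) {
		sk_hold(sk);
		x->cached_sk = sk;
	}
	/* ... transmit using x->cached_sk ... */
}

/* Uncached scheme: the reference lives only for the duration of the send. */
static void send_uncached(struct sock_model *sk)
{
	sk_hold(sk);	/* models the lookup taking a reference */
	/* ... transmit ... */
	sk_put(sk);	/* models the sock_put() after the send */
}

/* Deleting the state is the only way to drop the cached reference. */
static void state_delete(struct state_model *x)
{
	if (x->cached_sk) {
		sk_put(x->cached_sk);
		x->cached_sk = NULL;
	}
}
```

In the cached variant, closing the socket from userspace (one sk_put()) is not enough to free the netns: it stays pinned until state_delete() runs, and that step is the one that becomes impossible once nothing can reach the netns. In the uncached variant the netns goes away as soon as the owner closes the socket.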
This patch results in a small (~2% in my tests) performance
regression.
A GC-type mechanism could be added for the socket cache, to clear
references if the state hasn't been used "recently", but it's a lot
more complex than just not caching the socket.
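For comparison, the core of such a GC-type mechanism might look roughly like the userspace sketch below. It is not something this series implements: `gc_sock`, `gc_state`, `gc_cache_use()`, `gc_cache_sweep()` and the `max_idle` parameter are all hypothetical, and the sketch leaves out the locking, RCU and timer handling where most of the real complexity would sit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical socket with a plain reference count. */
struct gc_sock {
	int refs;
};

/* Hypothetical state with a cached socket and a last-use timestamp. */
struct gc_state {
	struct gc_sock *cached_sk;
	unsigned long last_used;
};

/* Record a use of @sk through @x, caching a reference if it is not cached yet. */
static void gc_cache_use(struct gc_state *x, struct gc_sock *sk,
			 unsigned long now)
{
	if (x->cached_sk != sk) {
		if (x->cached_sk)
			x->cached_sk->refs--;	/* drop the stale entry */
		sk->refs++;
		x->cached_sk = sk;
	}
	x->last_used = now;
}

/*
 * Drop cached references that have been idle for more than @max_idle.
 * Returns the number of references dropped.
 */
static size_t gc_cache_sweep(struct gc_state *states, size_t n,
			     unsigned long now, unsigned long max_idle)
{
	size_t dropped = 0;

	for (size_t i = 0; i < n; i++) {
		struct gc_state *x = &states[i];

		if (x->cached_sk && now - x->last_used > max_idle) {
			x->cached_sk->refs--;
			x->cached_sk = NULL;
			dropped++;
		}
	}
	return dropped;
}
```

Even in this toy form, something has to run the sweep periodically and every sender has to keep the timestamp current, which is extra machinery compared with not caching at all.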
Fixes: e27cca96cd68 ("xfrm: add espintcp (RFC 8229)")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
 include/net/xfrm.h    |  1 -
 net/ipv4/esp4.c       | 49 ++++---------------------------------------
 net/ipv6/esp6.c       | 49 ++++---------------------------------------
 net/xfrm/xfrm_state.c |  3 ---
 4 files changed, 8 insertions(+), 94 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index ed4b83696c77..7e698b0306a8 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -236,7 +236,6 @@ struct xfrm_state {

 	/* Data for encapsulator */
 	struct xfrm_encap_tmpl *encap;
-	struct sock __rcu *encap_sk;

 	/* NAT keepalive */
 	u32 nat_keepalive_interval; /* seconds */
diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index 876df672c0bf..f14a41ee4aa1 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -120,47 +120,16 @@ static void esp_ssg_unref(struct xfrm_state *x, void *tmp, struct sk_buff *skb)
 }

 #ifdef CONFIG_INET_ESPINTCP
-struct esp_tcp_sk {
-	struct sock *sk;
-	struct rcu_head rcu;
-};
-
-static void esp_free_tcp_sk(struct rcu_head *head)
-{
-	struct esp_tcp_sk *esk = container_of(head, struct esp_tcp_sk, rcu);
-
-	sock_put(esk->sk);
-	kfree(esk);
-}
-
 static struct sock *esp_find_tcp_sk(struct xfrm_state *x)
 {
 	struct xfrm_encap_tmpl *encap = x->encap;
 	struct net *net = xs_net(x);
-	struct esp_tcp_sk *esk;
 	__be16 sport, dport;
-	struct sock *nsk;
 	struct sock *sk;

-	sk = rcu_dereference(x->encap_sk);
-	if (sk && sk->sk_state == TCP_ESTABLISHED)
-		return sk;
-
 	spin_lock_bh(&x->lock);
 	sport = encap->encap_sport;
 	dport = encap->encap_dport;
-	nsk = rcu_dereference_protected(x->encap_sk,
-					lockdep_is_held(&x->lock));
-	if (sk && sk == nsk) {
-		esk = kmalloc(sizeof(*esk), GFP_ATOMIC);
-		if (!esk) {
-			spin_unlock_bh(&x->lock);
-			return ERR_PTR(-ENOMEM);
-		}
-		RCU_INIT_POINTER(x->encap_sk, NULL);
-		esk->sk = sk;
-		call_rcu(&esk->rcu, esp_free_tcp_sk);
-	}
 	spin_unlock_bh(&x->lock);

 	sk = inet_lookup_established(net, net->ipv4.tcp_death_row.hashinfo, x->id.daddr.a4,
@@ -173,20 +142,6 @@ static struct sock *esp_find_tcp_sk(struct xfrm_state *x)
 		return ERR_PTR(-EINVAL);
 	}

-	spin_lock_bh(&x->lock);
-	nsk = rcu_dereference_protected(x->encap_sk,
-					lockdep_is_held(&x->lock));
-	if (encap->encap_sport != sport ||
-	    encap->encap_dport != dport) {
-		sock_put(sk);
-		sk = nsk ?: ERR_PTR(-EREMCHG);
-	} else if (sk == nsk) {
-		sock_put(sk);
-	} else {
-		rcu_assign_pointer(x->encap_sk, sk);
-	}
-	spin_unlock_bh(&x->lock);
-
 	return sk;
 }

@@ -211,6 +166,8 @@ static int esp_output_tcp_finish(struct xfrm_state *x, struct sk_buff *skb)
 		err = espintcp_push_skb(sk, skb);
 	bh_unlock_sock(sk);

+	sock_put(sk);
+
 out:
 	rcu_read_unlock();
 	return err;
@@ -394,6 +351,8 @@ static struct ip_esp_hdr *esp_output_tcp_encap(struct xfrm_state *x,
 	if (IS_ERR(sk))
 		return ERR_CAST(sk);

+	sock_put(sk);
+
 	*lenp = htons(len);
 	esph = (struct ip_esp_hdr *)(lenp + 1);

diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c
index 574989b82179..72adfc107b55 100644
--- a/net/ipv6/esp6.c
+++ b/net/ipv6/esp6.c
@@ -137,47 +137,16 @@ static void esp_ssg_unref(struct xfrm_state *x, void *tmp, struct sk_buff *skb)
 }

 #ifdef CONFIG_INET6_ESPINTCP
-struct esp_tcp_sk {
-	struct sock *sk;
-	struct rcu_head rcu;
-};
-
-static void esp_free_tcp_sk(struct rcu_head *head)
-{
-	struct esp_tcp_sk *esk = container_of(head, struct esp_tcp_sk, rcu);
-
-	sock_put(esk->sk);
-	kfree(esk);
-}
-
 static struct sock *esp6_find_tcp_sk(struct xfrm_state *x)
 {
 	struct xfrm_encap_tmpl *encap = x->encap;
 	struct net *net = xs_net(x);
-	struct esp_tcp_sk *esk;
 	__be16 sport, dport;
-	struct sock *nsk;
 	struct sock *sk;

-	sk = rcu_dereference(x->encap_sk);
-	if (sk && sk->sk_state == TCP_ESTABLISHED)
-		return sk;
-
 	spin_lock_bh(&x->lock);
 	sport = encap->encap_sport;
 	dport = encap->encap_dport;
-	nsk = rcu_dereference_protected(x->encap_sk,
-					lockdep_is_held(&x->lock));
-	if (sk && sk == nsk) {
-		esk = kmalloc(sizeof(*esk), GFP_ATOMIC);
-		if (!esk) {
-			spin_unlock_bh(&x->lock);
-			return ERR_PTR(-ENOMEM);
-		}
-		RCU_INIT_POINTER(x->encap_sk, NULL);
-		esk->sk = sk;
-		call_rcu(&esk->rcu, esp_free_tcp_sk);
-	}
 	spin_unlock_bh(&x->lock);

 	sk = __inet6_lookup_established(net, net->ipv4.tcp_death_row.hashinfo, &x->id.daddr.in6,
@@ -190,20 +159,6 @@ static struct sock *esp6_find_tcp_sk(struct xfrm_state *x)
 		return ERR_PTR(-EINVAL);
 	}

-	spin_lock_bh(&x->lock);
-	nsk = rcu_dereference_protected(x->encap_sk,
-					lockdep_is_held(&x->lock));
-	if (encap->encap_sport != sport ||
-	    encap->encap_dport != dport) {
-		sock_put(sk);
-		sk = nsk ?: ERR_PTR(-EREMCHG);
-	} else if (sk == nsk) {
-		sock_put(sk);
-	} else {
-		rcu_assign_pointer(x->encap_sk, sk);
-	}
-	spin_unlock_bh(&x->lock);
-
 	return sk;
 }

@@ -228,6 +183,8 @@ static int esp_output_tcp_finish(struct xfrm_state *x, struct sk_buff *skb)
 		err = espintcp_push_skb(sk, skb);
 	bh_unlock_sock(sk);

+	sock_put(sk);
+
 out:
 	rcu_read_unlock();
 	return err;
@@ -424,6 +381,8 @@ static struct ip_esp_hdr *esp6_output_tcp_encap(struct xfrm_state *x,
 	if (IS_ERR(sk))
 		return ERR_CAST(sk);

+	sock_put(sk);
+
 	*lenp = htons(len);
 	esph = (struct ip_esp_hdr *)(lenp + 1);

diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index ad2202fa82f3..3acf6c14de08 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -840,9 +840,6 @@ int __xfrm_state_delete(struct xfrm_state *x)
 		xfrm_nat_keepalive_state_updated(x);
 		spin_unlock(&net->xfrm.xfrm_state_lock);

-		if (x->encap_sk)
-			sock_put(rcu_dereference_raw(x->encap_sk));
-
 		xfrm_dev_state_delete(x);

 		/* All xfrm_state objects are created by xfrm_state_alloc.
--
2.49.0
* Re: [PATCH ipsec 1/2] espintcp: fix skb leaks
2025-04-09 13:59 ` [PATCH ipsec 1/2] espintcp: fix skb leaks Sabrina Dubroca
@ 2025-04-11 13:58 ` Simon Horman
0 siblings, 0 replies; 6+ messages in thread
From: Simon Horman @ 2025-04-11 13:58 UTC (permalink / raw)
To: Sabrina Dubroca; +Cc: netdev, Herbert Xu, Steffen Klassert
On Wed, Apr 09, 2025 at 03:59:56PM +0200, Sabrina Dubroca wrote:
> A few error paths are missing a kfree_skb.
>
> Fixes: e27cca96cd68 ("xfrm: add espintcp (RFC 8229)")
> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Simon Horman <horms@kernel.org>
* Re: [PATCH ipsec 2/2] espintcp: remove encap socket caching to avoid reference leak
2025-04-09 13:59 ` [PATCH ipsec 2/2] espintcp: remove encap socket caching to avoid reference leak Sabrina Dubroca
@ 2025-04-11 13:59 ` Simon Horman
0 siblings, 0 replies; 6+ messages in thread
From: Simon Horman @ 2025-04-11 13:59 UTC (permalink / raw)
To: Sabrina Dubroca; +Cc: netdev, Herbert Xu, Steffen Klassert
On Wed, Apr 09, 2025 at 03:59:57PM +0200, Sabrina Dubroca wrote:
> The current scheme for caching the encap socket can lead to reference
> leaks when we try to delete the netns.
>
> The reference chain is: xfrm_state -> encap_sk -> netns
>
> Since the encap socket is a userspace socket, it holds a reference on
> the netns. If we delete the espintcp state (through flush or
> individual delete) before removing the netns, the reference on the
> socket is dropped and the netns is correctly deleted. Otherwise, the
> netns may not be reachable anymore (if all processes within the ns
> have terminated), so we cannot delete the xfrm state to drop its
> reference on the socket.
>
> This patch results in a small (~2% in my tests) performance
> regression.
>
> A GC-type mechanism could be added for the socket cache, to clear
> references if the state hasn't been used "recently", but it's a lot
> more complex than just not caching the socket.
Less is more :)
> Fixes: e27cca96cd68 ("xfrm: add espintcp (RFC 8229)")
> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Simon Horman <horms@kernel.org>
* Re: [PATCH ipsec 0/2] fix some leaks in espintcp
2025-04-09 13:59 [PATCH ipsec 0/2] fix some leaks in espintcp Sabrina Dubroca
2025-04-09 13:59 ` [PATCH ipsec 1/2] espintcp: fix skb leaks Sabrina Dubroca
2025-04-09 13:59 ` [PATCH ipsec 2/2] espintcp: remove encap socket caching to avoid reference leak Sabrina Dubroca
@ 2025-04-14 10:00 ` Steffen Klassert
2 siblings, 0 replies; 6+ messages in thread
From: Steffen Klassert @ 2025-04-14 10:00 UTC (permalink / raw)
To: Sabrina Dubroca; +Cc: netdev, Herbert Xu
On Wed, Apr 09, 2025 at 03:59:55PM +0200, Sabrina Dubroca wrote:
> kmemleak spotted a few leaks that have been here since the beginning.
>
> Sabrina Dubroca (2):
> espintcp: fix skb leaks
> espintcp: remove encap socket caching to avoid reference leak
Series applied, thanks Sabrina!