* [PATCH net] tcp: handle inet_csk_reqsk_queue_add() failures
From: Guillaume Nault @ 2019-03-08 21:09 UTC
To: netdev; +Cc: Eric Dumazet
Commit 7716682cc58e ("tcp/dccp: fix another race at listener
dismantle") let inet_csk_reqsk_queue_add() fail, and adjusted
{tcp,dccp}_check_req() accordingly. However, TFO and syncookies
weren't modified, thus leaking allocated resources on error.
Contrary to tcp_check_req(), in both syncookies and TFO cases,
we need to drop the request socket. Also, since the child socket is
created with inet_csk_clone_lock(), we have to unlock it and drop an
extra reference (->sk_refcount is initially set to 2 and
inet_csk_reqsk_queue_add() drops only one ref).
For TFO, we also need to revert the work done by tcp_try_fastopen()
(with reqsk_fastopen_remove()).
Fixes: 7716682cc58e ("tcp/dccp: fix another race at listener dismantle")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
Note for stable backports: this patch relies on da8ab57863ed
("tcp/dccp: remove reqsk_put() from inet_child_forget()"), to prevent
inet_child_forget() from dropping a reference from the request socket.
Therefore, for trees older than 4.14, commit da8ab57863ed has to be
backported before this patch.
net/ipv4/syncookies.c | 7 ++++++-
net/ipv4/tcp_input.c | 8 +++++++-
2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 606f868d9f3f..e531344611a0 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -216,7 +216,12 @@ struct sock *tcp_get_cookie_sock(struct sock *sk, struct sk_buff *skb,
 		refcount_set(&req->rsk_refcnt, 1);
 		tcp_sk(child)->tsoffset = tsoff;
 		sock_rps_save_rxhash(child, skb);
-		inet_csk_reqsk_queue_add(sk, req, child);
+		if (!inet_csk_reqsk_queue_add(sk, req, child)) {
+			bh_unlock_sock(child);
+			sock_put(child);
+			child = NULL;
+			reqsk_put(req);
+		}
 	} else {
 		reqsk_free(req);
 	}
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 4eb0c8ca3c60..5def3c48870e 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6498,7 +6498,13 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 		af_ops->send_synack(fastopen_sk, dst, &fl, req,
 				    &foc, TCP_SYNACK_FASTOPEN);
 		/* Add the child socket directly into the accept queue */
-		inet_csk_reqsk_queue_add(sk, req, fastopen_sk);
+		if (!inet_csk_reqsk_queue_add(sk, req, fastopen_sk)) {
+			reqsk_fastopen_remove(fastopen_sk, req, false);
+			bh_unlock_sock(fastopen_sk);
+			sock_put(fastopen_sk);
+			reqsk_put(req);
+			goto drop;
+		}
 		sk->sk_data_ready(sk);
 		bh_unlock_sock(fastopen_sk);
 		sock_put(fastopen_sk);
--
2.20.1
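
Editorial sketch: the reference counting argument in the commit message above can be followed with a small user-space model. This is not kernel code. The names `toy_obj`, `toy_put`, `toy_queue_add_fail` and `toy_error_path` are hypothetical, and the model assumes only what the commit message states: the child socket starts with two references (the kernel field is `sk_refcnt`), the request socket has one (`refcount_set(&req->rsk_refcnt, 1)`), and a failing inet_csk_reqsk_queue_add() drops a single child reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct sock / struct request_sock: only the counter. */
struct toy_obj {
	int refcnt;
	bool freed;
};

/* Drop one reference; mark the object freed when the last one goes. */
static void toy_put(struct toy_obj *o)
{
	assert(o->refcnt > 0);
	if (--o->refcnt == 0)
		o->freed = true;
}

/*
 * Models a *failing* inet_csk_reqsk_queue_add(): per the commit
 * message it drops one child reference and leaves req untouched.
 */
static bool toy_queue_add_fail(struct toy_obj *req, struct toy_obj *child)
{
	(void)req;
	toy_put(child);
	return false;
}

/* Returns true when both objects were released on the error path. */
static bool toy_error_path(bool with_fix)
{
	struct toy_obj req = { .refcnt = 1, .freed = false };
	struct toy_obj child = { .refcnt = 2, .freed = false };

	if (!toy_queue_add_fail(&req, &child) && with_fix) {
		toy_put(&child);	/* stands in for sock_put(child) */
		toy_put(&req);		/* stands in for reqsk_put(req) */
	}
	return req.freed && child.freed;
}
```

In this model, `toy_error_path(false)` leaves both counters at 1, which corresponds to the leak the patch describes, while `toy_error_path(true)` releases both objects. Locking (`bh_unlock_sock()`) and the TFO-specific `reqsk_fastopen_remove()` step are deliberately left out.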

* Re: [PATCH net] tcp: handle inet_csk_reqsk_queue_add() failures
From: Eric Dumazet @ 2019-03-08 21:33 UTC
To: Guillaume Nault, netdev

On 03/08/2019 01:09 PM, Guillaume Nault wrote:
> Commit 7716682cc58e ("tcp/dccp: fix another race at listener
> dismantle") let inet_csk_reqsk_queue_add() fail, and adjusted
> {tcp,dccp}_check_req() accordingly. However, TFO and syncookies
> weren't modified, thus leaking allocated resources on error.
>
> Contrary to tcp_check_req(), in both syncookies and TFO cases,
> we need to drop the request socket. Also, since the child socket is
> created with inet_csk_clone_lock(), we have to unlock it and drop an
> extra reference (->sk_refcount is initially set to 2 and
> inet_csk_reqsk_queue_add() drops only one ref).
>
> For TFO, we also need to revert the work done by tcp_try_fastopen()
> (with reqsk_fastopen_remove()).
>
> Fixes: 7716682cc58e ("tcp/dccp: fix another race at listener dismantle")
> Signed-off-by: Guillaume Nault <gnault@redhat.com>
> ---
>
> Note for stable backports: this patch relies on da8ab57863ed
> ("tcp/dccp: remove reqsk_put() from inet_child_forget()"), to prevent
> inet_child_forget() from dropping a reference from the request socket.
>
> Therefore, for trees older than 4.14, commit da8ab57863ed has to be
> backported before this patch.
>

Thanks for working on this issue (it was on my radar as well)

>  net/ipv4/syncookies.c | 7 ++++++-
>  net/ipv4/tcp_input.c | 8 +++++++-
>  2 files changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
> index 606f868d9f3f..e531344611a0 100644
> --- a/net/ipv4/syncookies.c
> +++ b/net/ipv4/syncookies.c
> @@ -216,7 +216,12 @@ struct sock *tcp_get_cookie_sock(struct sock *sk, struct sk_buff *skb,
>  		refcount_set(&req->rsk_refcnt, 1);
>  		tcp_sk(child)->tsoffset = tsoff;
>  		sock_rps_save_rxhash(child, skb);
> -		inet_csk_reqsk_queue_add(sk, req, child);
> +		if (!inet_csk_reqsk_queue_add(sk, req, child)) {
> +			bh_unlock_sock(child);
> +			sock_put(child);
> +			child = NULL;
> +			reqsk_put(req);

Since we use reqsk_free(req) in the same function, we can use reqsk_free(req)
here as well ?

I suggest the following maybe :

diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 606f868d9f3fde1c3140aa7eecde87d2ec32b5f2..8b28fb66a8fcefba27a2f5e371e9469d4d7e3650 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -216,11 +216,14 @@ struct sock *tcp_get_cookie_sock(struct sock *sk, struct sk_buff *skb,
 		refcount_set(&req->rsk_refcnt, 1);
 		tcp_sk(child)->tsoffset = tsoff;
 		sock_rps_save_rxhash(child, skb);
-		inet_csk_reqsk_queue_add(sk, req, child);
-	} else {
-		reqsk_free(req);
+		if (likely(inet_csk_reqsk_queue_add(sk, req, child)))
+			return child;
+		bh_unlock_sock(child);
+		sock_put(child);
 	}
-	return child;
+
+	reqsk_free(req);
+	return NULL;
 }
 EXPORT_SYMBOL(tcp_get_cookie_sock);

> +		}
>  	} else {
>  		reqsk_free(req);
>  	}
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 4eb0c8ca3c60..5def3c48870e 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -6498,7 +6498,13 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
>  		af_ops->send_synack(fastopen_sk, dst, &fl, req,
>  				    &foc, TCP_SYNACK_FASTOPEN);
>  		/* Add the child socket directly into the accept queue */
> -		inet_csk_reqsk_queue_add(sk, req, fastopen_sk);
> +		if (!inet_csk_reqsk_queue_add(sk, req, fastopen_sk)) {
> +			reqsk_fastopen_remove(fastopen_sk, req, false);
> +			bh_unlock_sock(fastopen_sk);
> +			sock_put(fastopen_sk);
> +			reqsk_put(req);
> +			goto drop;

These two lines can be replaced by :

	goto drop_and_free;

> +		}
>  		sk->sk_data_ready(sk);
>  		bh_unlock_sock(fastopen_sk);
>  		sock_put(fastopen_sk);
>

* Re: [PATCH net] tcp: handle inet_csk_reqsk_queue_add() failures
From: Guillaume Nault @ 2019-03-08 22:22 UTC
To: Eric Dumazet; +Cc: netdev

On Fri, Mar 08, 2019 at 01:33:02PM -0800, Eric Dumazet wrote:
>
>
> On 03/08/2019 01:09 PM, Guillaume Nault wrote:
> > @@ -216,7 +216,12 @@ struct sock *tcp_get_cookie_sock(struct sock *sk, struct sk_buff *skb,
> >  		refcount_set(&req->rsk_refcnt, 1);
> >  		tcp_sk(child)->tsoffset = tsoff;
> >  		sock_rps_save_rxhash(child, skb);
> > -		inet_csk_reqsk_queue_add(sk, req, child);
> > +		if (!inet_csk_reqsk_queue_add(sk, req, child)) {
> > +			bh_unlock_sock(child);
> > +			sock_put(child);
> > +			child = NULL;
> > +			reqsk_put(req);
>
> Since we use reqsk_free(req) in the same function, we can use reqsk_free(req)
> here as well ?
>
That was my first approach, but reqsk_free() doesn't like it:

static inline void reqsk_free(struct request_sock *req)
{
	/* temporary debugging */
	WARN_ON_ONCE(refcount_read(&req->rsk_refcnt) != 0);
	...
}

> I suggest the following maybe :
>
> diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
> index 606f868d9f3fde1c3140aa7eecde87d2ec32b5f2..8b28fb66a8fcefba27a2f5e371e9469d4d7e3650 100644
> --- a/net/ipv4/syncookies.c
> +++ b/net/ipv4/syncookies.c
> @@ -216,11 +216,14 @@ struct sock *tcp_get_cookie_sock(struct sock *sk, struct sk_buff *skb,
>  		refcount_set(&req->rsk_refcnt, 1);
>  		tcp_sk(child)->tsoffset = tsoff;
>  		sock_rps_save_rxhash(child, skb);
> -		inet_csk_reqsk_queue_add(sk, req, child);
> -	} else {
> -		reqsk_free(req);
> +		if (likely(inet_csk_reqsk_queue_add(sk, req, child)))
> +			return child;
> +		bh_unlock_sock(child);
> +		sock_put(child);
>  	}
> -	return child;
> +
> +	reqsk_free(req);
> +	return NULL;
>  }
>  EXPORT_SYMBOL(tcp_get_cookie_sock);
>
>
I prefer this form as well, but I'm not sure if removing the "temporary"
WARN() is appropriate for -net.
If it is, I'll resubmit. Otherwise I can refactor it after net-next reopens.
Any opinion?

Guillaume
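
Editorial sketch: the objection to calling reqsk_free() directly can be illustrated with another user-space model. The names `toy_req`, `toy_reqsk_free` and `toy_reqsk_put` are hypothetical. The free path mirrors the WARN_ON_ONCE() quoted in the message above; the put path assumes that reqsk_put() drops one reference and frees on the last one, which is not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct request_sock. */
struct toy_req {
	int refcnt;
	bool freed;
	bool warned;
};

/* Models reqsk_free(): frees unconditionally, complains if refs remain. */
static void toy_reqsk_free(struct toy_req *req)
{
	if (req->refcnt != 0)
		req->warned = true;	/* stands in for WARN_ON_ONCE() */
	req->freed = true;
}

/* Models reqsk_put() (assumed behaviour): drop a ref, free on the last. */
static void toy_reqsk_put(struct toy_req *req)
{
	assert(req->refcnt > 0);
	if (--req->refcnt == 0)
		toy_reqsk_free(req);
}
```

With `refcnt` set to 1, as it is after `refcount_set(&req->rsk_refcnt, 1)`, calling `toy_reqsk_free()` sets `warned`, whereas `toy_reqsk_put()` brings the counter to zero first and frees without a warning.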

* Re: [PATCH net] tcp: handle inet_csk_reqsk_queue_add() failures
From: Eric Dumazet @ 2019-03-08 22:34 UTC
To: Guillaume Nault; +Cc: netdev

On 03/08/2019 02:22 PM, Guillaume Nault wrote:
> On Fri, Mar 08, 2019 at 01:33:02PM -0800, Eric Dumazet wrote:
>>
>>
>> On 03/08/2019 01:09 PM, Guillaume Nault wrote:
>>> @@ -216,7 +216,12 @@ struct sock *tcp_get_cookie_sock(struct sock *sk, struct sk_buff *skb,
>>>  		refcount_set(&req->rsk_refcnt, 1);
>>>  		tcp_sk(child)->tsoffset = tsoff;
>>>  		sock_rps_save_rxhash(child, skb);
>>> -		inet_csk_reqsk_queue_add(sk, req, child);
>>> +		if (!inet_csk_reqsk_queue_add(sk, req, child)) {
>>> +			bh_unlock_sock(child);
>>> +			sock_put(child);
>>> +			child = NULL;
>>> +			reqsk_put(req);
>>
>> Since we use reqsk_free(req) in the same function, we can use reqsk_free(req)
>> here as well ?
>>
> That was my first approach, but reqsk_free() doesn't like it:
>
> static inline void reqsk_free(struct request_sock *req)
> {
> 	/* temporary debugging */
> 	WARN_ON_ONCE(refcount_read(&req->rsk_refcnt) != 0);
> 	...
> }

Oh right, there is this refcount_set(&req->rsk_refcnt, 1) before the call
to inet_csk_reqsk_queue_add(sk, req, child);

So just change the TFO case only :)

* Re: [PATCH net] tcp: handle inet_csk_reqsk_queue_add() failures
From: Guillaume Nault @ 2019-03-08 22:40 UTC
To: Eric Dumazet; +Cc: netdev

On Fri, Mar 08, 2019 at 02:34:07PM -0800, Eric Dumazet wrote:
>
>
> On 03/08/2019 02:22 PM, Guillaume Nault wrote:
> > On Fri, Mar 08, 2019 at 01:33:02PM -0800, Eric Dumazet wrote:
> >>
> >>
> >> On 03/08/2019 01:09 PM, Guillaume Nault wrote:
> >>> @@ -216,7 +216,12 @@ struct sock *tcp_get_cookie_sock(struct sock *sk, struct sk_buff *skb,
> >>>  		refcount_set(&req->rsk_refcnt, 1);
> >>>  		tcp_sk(child)->tsoffset = tsoff;
> >>>  		sock_rps_save_rxhash(child, skb);
> >>> -		inet_csk_reqsk_queue_add(sk, req, child);
> >>> +		if (!inet_csk_reqsk_queue_add(sk, req, child)) {
> >>> +			bh_unlock_sock(child);
> >>> +			sock_put(child);
> >>> +			child = NULL;
> >>> +			reqsk_put(req);
> >>
> >> Since we use reqsk_free(req) in the same function, we can use reqsk_free(req)
> >> here as well ?
> >>
> > That was my first approach, but reqsk_free() doesn't like it:
> >
> > static inline void reqsk_free(struct request_sock *req)
> > {
> > 	/* temporary debugging */
> > 	WARN_ON_ONCE(refcount_read(&req->rsk_refcnt) != 0);
> > 	...
> > }
>
> Oh right, there is this refcount_set(&req->rsk_refcnt, 1) before the call
> to inet_csk_reqsk_queue_add(sk, req, child);
>
> So just change the TFO case only :)
>
Well.. refcount is 1 in the TFO case too.

Long term, do we want to keep the WARN_ON_ONCE()? If so, we should
probably remove the comment.

* Re: [PATCH net] tcp: handle inet_csk_reqsk_queue_add() failures
From: Eric Dumazet @ 2019-03-08 23:47 UTC
To: Guillaume Nault, Eric Dumazet; +Cc: netdev

On 03/08/2019 02:40 PM, Guillaume Nault wrote:
> On Fri, Mar 08, 2019 at 02:34:07PM -0800, Eric Dumazet wrote:
>>
>>
>> On 03/08/2019 02:22 PM, Guillaume Nault wrote:
>>> On Fri, Mar 08, 2019 at 01:33:02PM -0800, Eric Dumazet wrote:
>>>>
>>>>
>>>> On 03/08/2019 01:09 PM, Guillaume Nault wrote:
>>>>> @@ -216,7 +216,12 @@ struct sock *tcp_get_cookie_sock(struct sock *sk, struct sk_buff *skb,
>>>>>  		refcount_set(&req->rsk_refcnt, 1);
>>>>>  		tcp_sk(child)->tsoffset = tsoff;
>>>>>  		sock_rps_save_rxhash(child, skb);
>>>>> -		inet_csk_reqsk_queue_add(sk, req, child);
>>>>> +		if (!inet_csk_reqsk_queue_add(sk, req, child)) {
>>>>> +			bh_unlock_sock(child);
>>>>> +			sock_put(child);
>>>>> +			child = NULL;
>>>>> +			reqsk_put(req);
>>>>
>>>> Since we use reqsk_free(req) in the same function, we can use reqsk_free(req)
>>>> here as well ?
>>>>
>>> That was my first approach, but reqsk_free() doesn't like it:
>>>
>>> static inline void reqsk_free(struct request_sock *req)
>>> {
>>> 	/* temporary debugging */
>>> 	WARN_ON_ONCE(refcount_read(&req->rsk_refcnt) != 0);
>>> 	...
>>> }
>>
>> Oh right, there is this refcount_set(&req->rsk_refcnt, 1) before the call
>> to inet_csk_reqsk_queue_add(sk, req, child);
>>
>> So just change the TFO case only :)
>>
> Well.. refcount is 1 in the TFO case too.

Arg...

>
> Long term, do we want to keep the WARN_ON_ONCE()? If so, we should
> probably remove the comment.

We want to keep the warning.

We do not have a way to tell if the req was ever inserted in a hash table, so better play safe.

Signed-off-by: Eric Dumazet <edumazet@google.com>

Thanks !

* Re: [PATCH net] tcp: handle inet_csk_reqsk_queue_add() failures
From: David Miller @ 2019-03-09 0:06 UTC
To: eric.dumazet; +Cc: gnault, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 8 Mar 2019 15:47:25 -0800

> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied and queued up for -stable.

* Re: [PATCH net] tcp: handle inet_csk_reqsk_queue_add() failures
From: Guillaume Nault @ 2019-03-09 9:02 UTC
To: Eric Dumazet; +Cc: netdev

On Fri, Mar 08, 2019 at 03:47:25PM -0800, Eric Dumazet wrote:
>
> On 03/08/2019 02:40 PM, Guillaume Nault wrote:
> > On Fri, Mar 08, 2019 at 02:34:07PM -0800, Eric Dumazet wrote:
> >
> > Long term, do we want to keep the WARN_ON_ONCE()? If so, we should
> > probably remove the comment.
>
> We want to keep the warning.
>
> We do not have a way to tell if the req was ever inserted in a hash table, so better play safe.
>
Then I'm going to remove the /* temporary debugging */ line, so that
nobody will be tempted to drop the test.

Thanks for your feedback.

Guillaume