* [PATCH] bpf: Fix sk_psock refcnt leak when receiving message @ 2020-04-25 12:50 Xiyu Yang 2020-04-25 14:24 ` Jakub Sitnicki 0 siblings, 1 reply; 3+ messages in thread From: Xiyu Yang @ 2020-04-25 12:50 UTC (permalink / raw) To: John Fastabend, Daniel Borkmann, Jakub Sitnicki, Lorenz Bauer, Eric Dumazet, David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI, Jakub Kicinski, Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko, KP Singh, netdev, bpf, linux-kernel Cc: yuanxzhang, kjlu, Xiyu Yang, Xin Tan tcp_bpf_recvmsg() invokes sk_psock_get(), which returns a reference of the specified sk_psock object to "psock" with increased refcnt. When tcp_bpf_recvmsg() returns, local variable "psock" becomes invalid, so the refcount should be decreased to keep refcount balanced. The reference counting issue happens in several exception handling paths of tcp_bpf_recvmsg(). When those error scenarios occur such as "flags" includes MSG_ERRQUEUE, the function forgets to decrease the refcnt increased by sk_psock_get(), causing a refcnt leak. Fix this issue by calling sk_psock_put() when those error scenarios occur. Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> --- net/ipv4/tcp_bpf.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c index 5a05327f97c1..feb6b90672c1 100644 --- a/net/ipv4/tcp_bpf.c +++ b/net/ipv4/tcp_bpf.c @@ -265,11 +265,15 @@ static int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, psock = sk_psock_get(sk); if (unlikely(!psock)) return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len); - if (unlikely(flags & MSG_ERRQUEUE)) + if (unlikely(flags & MSG_ERRQUEUE)) { + sk_psock_put(sk, psock); return inet_recv_error(sk, msg, len, addr_len); + } if (!skb_queue_empty(&sk->sk_receive_queue) && - sk_psock_queue_empty(psock)) + sk_psock_queue_empty(psock)) { + sk_psock_put(sk, psock); return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len); + } lock_sock(sk); msg_bytes_ready: copied = __tcp_bpf_recvmsg(sk, psock, msg, len, flags); -- 2.7.4 ^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] bpf: Fix sk_psock refcnt leak when receiving message 2020-04-25 12:50 [PATCH] bpf: Fix sk_psock refcnt leak when receiving message Xiyu Yang @ 2020-04-25 14:24 ` Jakub Sitnicki 2020-04-25 21:17 ` John Fastabend 0 siblings, 1 reply; 3+ messages in thread From: Jakub Sitnicki @ 2020-04-25 14:24 UTC (permalink / raw) To: Xiyu Yang Cc: John Fastabend, Daniel Borkmann, Lorenz Bauer, Eric Dumazet, David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI, Jakub Kicinski, Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko, KP Singh, netdev, bpf, linux-kernel, yuanxzhang, kjlu, Xin Tan On Sat, Apr 25, 2020 at 02:50 PM CEST, Xiyu Yang wrote: > tcp_bpf_recvmsg() invokes sk_psock_get(), which returns a reference of > the specified sk_psock object to "psock" with increased refcnt. > > When tcp_bpf_recvmsg() returns, local variable "psock" becomes invalid, > so the refcount should be decreased to keep refcount balanced. > > The reference counting issue happens in several exception handling paths > of tcp_bpf_recvmsg(). When those error scenarios occur such as "flags" > includes MSG_ERRQUEUE, the function forgets to decrease the refcnt > increased by sk_psock_get(), causing a refcnt leak. > > Fix this issue by calling sk_psock_put() when those error scenarios > occur. > > Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> > Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> > --- > net/ipv4/tcp_bpf.c | 8 ++++++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c > index 5a05327f97c1..feb6b90672c1 100644 > --- a/net/ipv4/tcp_bpf.c > +++ b/net/ipv4/tcp_bpf.c > @@ -265,11 +265,15 @@ static int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, > psock = sk_psock_get(sk); > if (unlikely(!psock)) > return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len); > - if (unlikely(flags & MSG_ERRQUEUE)) > + if (unlikely(flags & MSG_ERRQUEUE)) { > + sk_psock_put(sk, psock); > return inet_recv_error(sk, msg, len, addr_len); > + } > if (!skb_queue_empty(&sk->sk_receive_queue) && > - sk_psock_queue_empty(psock)) > + sk_psock_queue_empty(psock)) { > + sk_psock_put(sk, psock); > return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len); > + } > lock_sock(sk); > msg_bytes_ready: > copied = __tcp_bpf_recvmsg(sk, psock, msg, len, flags); Thanks for the fix. We can pull up the error queue read handling, that is the `flags & MSG_ERRQUEUE` branch, so that it happens before we grab a psock ref. The effect is the same because now, if we hit the !psock branch, tcp_recvmsg will first check if user wants to read the error queue anyway. That would translate to something like below, in addition to your changes. WDYT? ---8<--- diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c index 5a05327f97c1..99aa57bd1901 100644 --- a/net/ipv4/tcp_bpf.c +++ b/net/ipv4/tcp_bpf.c @@ -262,14 +262,17 @@ static int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, struct sk_psock *psock; int copied, ret; + if (unlikely(flags & MSG_ERRQUEUE)) + return inet_recv_error(sk, msg, len, addr_len); + psock = sk_psock_get(sk); if (unlikely(!psock)) return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len); - if (unlikely(flags & MSG_ERRQUEUE)) - return inet_recv_error(sk, msg, len, addr_len); if (!skb_queue_empty(&sk->sk_receive_queue) && sk_psock_queue_empty(psock)) return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len); ^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] bpf: Fix sk_psock refcnt leak when receiving message 2020-04-25 14:24 ` Jakub Sitnicki @ 2020-04-25 21:17 ` John Fastabend 0 siblings, 0 replies; 3+ messages in thread From: John Fastabend @ 2020-04-25 21:17 UTC (permalink / raw) To: Jakub Sitnicki, Xiyu Yang Cc: John Fastabend, Daniel Borkmann, Lorenz Bauer, Eric Dumazet, David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI, Jakub Kicinski, Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko, KP Singh, netdev, bpf, linux-kernel, yuanxzhang, kjlu, Xin Tan Jakub Sitnicki wrote: > On Sat, Apr 25, 2020 at 02:50 PM CEST, Xiyu Yang wrote: > > tcp_bpf_recvmsg() invokes sk_psock_get(), which returns a reference of > > the specified sk_psock object to "psock" with increased refcnt. > > > > When tcp_bpf_recvmsg() returns, local variable "psock" becomes invalid, > > so the refcount should be decreased to keep refcount balanced. > > > > The reference counting issue happens in several exception handling paths > > of tcp_bpf_recvmsg(). When those error scenarios occur such as "flags" > > includes MSG_ERRQUEUE, the function forgets to decrease the refcnt > > increased by sk_psock_get(), causing a refcnt leak. > > > > Fix this issue by calling sk_psock_put() when those error scenarios > > occur. > > > > Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> > > Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> > > --- Thanks Xiyu and Xin. Please add a Fixes tag, Fixes: e7a5f1f1cd000 ("bpf/sockmap: Read psock ingress_msg before sk_receive_queue") > > net/ipv4/tcp_bpf.c | 8 ++++++-- > > 1 file changed, 6 insertions(+), 2 deletions(-) > > > > diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c > > index 5a05327f97c1..feb6b90672c1 100644 > > --- a/net/ipv4/tcp_bpf.c > > +++ b/net/ipv4/tcp_bpf.c > > @@ -265,11 +265,15 @@ static int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, > > psock = sk_psock_get(sk); > > if (unlikely(!psock)) > > return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len); > > - if (unlikely(flags & MSG_ERRQUEUE)) > > + if (unlikely(flags & MSG_ERRQUEUE)) { > > + sk_psock_put(sk, psock); > > return inet_recv_error(sk, msg, len, addr_len); > > + } > > if (!skb_queue_empty(&sk->sk_receive_queue) && > > - sk_psock_queue_empty(psock)) > > + sk_psock_queue_empty(psock)) { > > + sk_psock_put(sk, psock); > > return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len); > > + } > > lock_sock(sk); > > msg_bytes_ready: > > copied = __tcp_bpf_recvmsg(sk, psock, msg, len, flags); > > Thanks for the fix. > > We can pull up the error queue read handling, that is the `flags & > MSG_ERRQUEUE` branch, so that it happens before we grab a psock ref. > > The effect is the same because now, if we hit the !psock branch, > tcp_recvmsg will first check if user wants to read the error queue > anyway. > > That would translate to something like below, in addition to your > changes. > > WDYT? Sure that seems slightly nicer to me. Xiyu do you mind sending a v2 with that change. Thanks again, John > > ---8<--- > > diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c > index 5a05327f97c1..99aa57bd1901 100644 > --- a/net/ipv4/tcp_bpf.c > +++ b/net/ipv4/tcp_bpf.c > @@ -262,14 +262,17 @@ static int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, > struct sk_psock *psock; > int copied, ret; > > + if (unlikely(flags & MSG_ERRQUEUE)) > + return inet_recv_error(sk, msg, len, addr_len); > + > psock = sk_psock_get(sk); > if (unlikely(!psock)) > return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len); > - if (unlikely(flags & MSG_ERRQUEUE)) > - return inet_recv_error(sk, msg, len, addr_len); > if (!skb_queue_empty(&sk->sk_receive_queue) && > sk_psock_queue_empty(psock)) > return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len); ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2020-04-25 21:17 UTC | newest] Thread overview: 3+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2020-04-25 12:50 [PATCH] bpf: Fix sk_psock refcnt leak when receiving message Xiyu Yang 2020-04-25 14:24 ` Jakub Sitnicki 2020-04-25 21:17 ` John Fastabend
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).