From: Vlad Yasevich <vyasevich@gmail.com>
To: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>,
netdev@vger.kernel.org
Cc: linux-sctp@vger.kernel.org, dvyukov@google.com,
eric.dumazet@gmail.com, syzkaller@googlegroups.com,
kcc@google.com, glider@google.com, sasha.levin@oracle.com
Subject: Re: [PATCH net] sctp: do sanity checks before migrating the asoc
Date: Tue, 19 Jan 2016 13:37:52 -0500 [thread overview]
Message-ID: <569E8280.9080701@gmail.com> (raw)
In-Reply-To: <569E5D44.5000103@gmail.com>
On 01/19/2016 10:59 AM, Marcelo Ricardo Leitner wrote:
> Em 19-01-2016 12:19, Vlad Yasevich escreveu:
>> On 01/15/2016 04:40 PM, Marcelo Ricardo Leitner wrote:
>>> On Fri, Jan 15, 2016 at 08:11:03PM +0100, Dmitry Vyukov wrote:
>>>> On Fri, Jan 15, 2016 at 7:46 PM, Marcelo Ricardo Leitner
>>>> <marcelo.leitner@gmail.com> wrote:
>>>>> On Wed, Dec 30, 2015 at 09:42:27PM +0100, Dmitry Vyukov wrote:
>>>>>> Hello,
>>>>>>
>>>>>> The following program leads to a leak of two sock objects:
>>>>> ...
>>>>>>
>>>>>> On commit 8513342170278468bac126640a5d2d12ffbff106 (Dec 28).
>>>>>
>>>>> I'm afraid I cannot reproduce this one?
>>>>> I enabled dynprintk at sctp_destroy_sock and it does print twice when I
>>>>> run this test app.
>>>>> Also added debugs to check association lifetime, and then it was
>>>>> destroyed. Same for endpoint.
>>>>>
>>>>> Checking with trace-cmd, both calls to sctp_close() resulted in
>>>>> sctp_destroy_sock() being called.
>>>>>
>>>>> As for sock_hold/put, they are matched too.
>>>>>
>>>>> Ideas? Log is below for double checking
>>>>
>>>>
>>>> Hummm... I can reproduce it pretty reliably.
>>>>
>>>> [ 197.459024] kmemleak: 11 new suspected memory leaks (see
>>>> /sys/kernel/debug/kmemleak)
>>>> [ 307.494874] kmemleak: 409 new suspected memory leaks (see
>>>> /sys/kernel/debug/kmemleak)
>>>> [ 549.784022] kmemleak: 125 new suspected memory leaks (see
>>>> /sys/kernel/debug/kmemleak)
>>>>
>>>> I double checked via /proc/slabinfo:
>>>>
>>>> SCTPv6 4373 4420 2368 13 8 : tunables 0 0
>>>> 0 : slabdata 340 340 0
>>>>
>>>> SCTPv6 starts with almost 0, but grows infinitely while I run the
>>>> program in a loop.
>>>>
>>>> Here is my SCTP related configs:
>>>>
>>>> CONFIG_IP_SCTP=y
>>>> CONFIG_NET_SCTPPROBE=y
>>>> CONFIG_SCTP_DBG_OBJCNT=y
>>>> # CONFIG_SCTP_DEFAULT_COOKIE_HMAC_MD5 is not set
>>>> # CONFIG_SCTP_DEFAULT_COOKIE_HMAC_SHA1 is not set
>>>> CONFIG_SCTP_DEFAULT_COOKIE_HMAC_NONE=y
>>>> # CONFIG_SCTP_COOKIE_HMAC_MD5 is not set
>>>> # CONFIG_SCTP_COOKIE_HMAC_SHA1 is not set
>>>>
>>>> I am on commit 67990608c8b95d2b8ccc29932376ae73d5818727 and I don't
>>>> seem to have any sctp-related changes on top.
>>>
>>> Ok, now I can. Enabled slub debugs, now I cannot see calls to
>>> sctp_destroy_sock. I see to sctp_close, but not to sctp_destroy_sock.
>>>
>>> And SCTPv6 grew by 2 sockets after the execution.
>>>
>>> Further checking, it's a race within SCTP asoc migration:
>>> thread 0 thread 1
>>> - app creates a sock
>>> - sends a packet to itself
>>> - sctp will create an asoc and do implicit
>>> handshake
>>> - send the packet
>>> - listen()
>>> - accept() is called and
>>> that asoc is migrated
>>> - packet is delivered
>>> - skb->destructor is called, BUT:
>>>
>>> (note that if accept() is called after packet is delivered and skb is freed, it
>>> doesn't happen)
>>>
>>> static void sctp_wfree(struct sk_buff *skb)
>>> {
>>> struct sctp_chunk *chunk = skb_shinfo(skb)->destructor_arg;
>>> struct sctp_association *asoc = chunk->asoc;
>>> struct sock *sk = asoc->base.sk;
>>> ...
>>> atomic_sub(sizeof(struct sctp_chunk), &sk->sk_wmem_alloc);
>>>
>>> and it's pointing to the new socket already. So one socket gets a leak
>>> on sk_wmem_alloc and another gets a negative value:
>>>
>>> --- a/net/sctp/socket.c
>>> +++ b/net/sctp/socket.c
>>> @@ -1537,12 +1537,14 @@ static void sctp_close(struct sock *sk, long timeout)
>>> /* Hold the sock, since sk_common_release() will put sock_put()
>>> * and we have just a little more cleanup.
>>> */
>>> + printk("%s sock_hold %p\n", __func__, sk);
>>> sock_hold(sk);
>>> sk_common_release(sk);
>>>
>>> bh_unlock_sock(sk);
>>> spin_unlock_bh(&net->sctp.addr_wq_lock);
>>>
>>> + printk("%s sock_put %p %d %d\n", __func__, sk, atomic_read(&sk->sk_refcnt),
>>> atomic_read(&sk->sk_wmem_alloc));
>>> sock_put(sk);
>>>
>>> SCTP_DBG_OBJCNT_DEC(sock);
>>>
>>>
>>> gave me:
>>>
>>> [ 99.456944] sctp_close sock_hold ffff880137df8940
>>> ...
>>> [ 99.457337] sctp_close sock_put ffff880137df8940 1 -247
>>> [ 99.458313] sctp_close sock_hold ffff880137dfef00
>>> ...
>>> [ 99.458383] sctp_close sock_put ffff880137dfef00 1 249
>>>
>>> That's why the socket is not freed..
>>>
>>
>> Interesting... sctp_sock_migrate() accounts for this race in the
>> receive buffer, but not the send buffer.
>>
>> On the one hand I am not crazy about the connect-to-self scenario.
>> On the other, I think to support it correctly, we should support
>> skb migrations for the send case just like we do the receive case.
>
>
> Yes, not thrilled here either about connect-to-self.
>
> But there is a big difference on how both works. For rx we can just look for wanted skbs
> in rx queue, as they aren't going anywhere, but for tx I don't think we can easily block
> sctp_wfree() call because that may be happening on another CPU (or am I mistaken here?
> sctp still doesn't have RFS but even irqbalance could affect this AFAICT) and more than
> one skb may be in transit at a time.
The way it's done now, we wouldn't have to block sctp_wfree. Chunks are released under
lock when they are acked, so we are OK here. The tx completions will just put 1 byte back
to the socket associated with the tx'ed skb, and that should still be ok as
sctp_packet_release_owner will call sk_free().
> The lockings for this on sctp_chunk would be pretty nasty, I think, and normal usage lets
> say wouldn't be benefit from it. Considering the possible migration, as we can't trust
> chunk->asoc right away in sctp_wfree, the lock would reside in sctp_chunk and we would
> have to go on taking locks one by one on tx queue for the migration. Ugh ;)
>
No, the chunks manipulation is done under the socket locket so I don't think we have to
worry about a per chunk lock. We should be able to trust chunk->asoc pointer always
because each chunk holds a ref on the association. The only somewhat ugly thing
about moving tx chunks is that you have to potentially walk a lot of lists to move
things around. There are all the lists in the sctp_outqueue struct, plus the
per-transport retransmit list...
Even though the above seems to be a PITA, my main reason for recommending this is
that can happen in normal situations too. Consider a very busy association that is
transferring a lot of a data on a 1-to-many socket. The app decides to move do a
peel-off, and we could now be stuck not being able to peel-off for a quite a while
if there is a hick-up in the network and we have to rtx multiple times.
-vlad
> Marcelo
>
next prev parent reply other threads:[~2016-01-19 18:37 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-12-30 20:42 net/sctp: sock memory leak Dmitry Vyukov
2015-12-30 20:47 ` Marcelo Ricardo Leitner
2016-01-15 18:46 ` Marcelo Ricardo Leitner
2016-01-15 19:11 ` Dmitry Vyukov
2016-01-15 21:40 ` [PATCH net] sctp: do sanity checks before migrating the asoc Marcelo Ricardo Leitner
2016-01-19 14:19 ` Vlad Yasevich
2016-01-19 15:59 ` Marcelo Ricardo Leitner
2016-01-19 18:37 ` Vlad Yasevich [this message]
2016-01-19 19:31 ` Marcelo Ricardo Leitner
2016-01-19 19:55 ` Vlad Yasevich
2016-01-19 20:08 ` Marcelo Ricardo Leitner
2016-02-03 16:13 ` Dmitry Vyukov
2016-02-04 9:47 ` Marcelo Ricardo Leitner
2016-03-02 8:56 ` net/sctp: sock memory leak Dmitry Vyukov
2016-03-02 19:42 ` Marcelo Ricardo Leitner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=569E8280.9080701@gmail.com \
--to=vyasevich@gmail.com \
--cc=dvyukov@google.com \
--cc=eric.dumazet@gmail.com \
--cc=glider@google.com \
--cc=kcc@google.com \
--cc=linux-sctp@vger.kernel.org \
--cc=marcelo.leitner@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=sasha.levin@oracle.com \
--cc=syzkaller@googlegroups.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).