netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Vlad Yasevich <vyasevich@gmail.com>
To: Neil Horman <nhorman@tuxdriver.com>
Cc: Daniel Borkmann <dborkman@redhat.com>,
	davem@davemloft.net, netdev@vger.kernel.org,
	linux-sctp@vger.kernel.org
Subject: Re: [PATCH net-next 4/5] net: sctp: decouple cleaning socket data from endpoint
Date: Tue, 18 Jun 2013 14:15:21 -0400	[thread overview]
Message-ID: <51C0A3B9.80905@gmail.com> (raw)
In-Reply-To: <20130618174503.GB27099@hmsreliant.think-freely.org>

On 06/18/2013 01:45 PM, Neil Horman wrote:
> On Tue, Jun 18, 2013 at 12:24:31PM -0400, Vlad Yasevich wrote:
>> On 06/18/2013 12:02 PM, Daniel Borkmann wrote:
>>> On 06/18/2013 04:22 PM, Neil Horman wrote:
>>>> I like this idea, but I think I'm maybe missing something from it - we
>>>> reference
>>>> the socket in both the receive and send paths (sctp_unpack_cookie, is
>>>> specifically called from the rx path, which makes use of sp->hmac).  a
>>>> socket
>>>> destructor can be called from __sk_free when sk_wmem_alloc reaches
>>>> zero, but we
>>>> use sk_refcnt in the rx path to prevent premature socket cleanup.  If
>>>> we drain
>>>> our send queeue while wer'e still processing rx messages, what
>>>> prevents us from
>>>> freeing the socket in the tx path, via sk_free while we're still using
>>>> the
>>>> socket in the rx path.  Note I don't think this patch is wrong per-se,
>>>> but it
>>>> seems to me there is more work to do to properly interlock the use of
>>>> sk_refcnt
>>>> and sk_wmem_alloc here (unless I'm just missing something obvious,
>>>> which is
>>>> entirely possible, I've been in the sun alot lately :) ).
>>>
>>> Hm, __sk_free() calls sk_prot_free() which frees our socket structure
>>> and in
>>> sctp_wfree() we do a sctp_association_put(asoc) after sock_wfree(skb).
>>>
>>> So no matter if having this patch or not, couldn't this use-after-free like
>>> scenario already happen with the current code?
>>>
>>> F.e. through a given call graph like that:
>>>
>>> sctp_wfree(skb):
>>>   1) sock_wfree(skb)
>>>      -> __sk_free()
>>
>> I don't think this can happen.  sk_wmem_alloc is set to 1 in sk_alloc()
>> and that acts as a guard to make sure that sk_free() has been called
>> before we try to free things up.  So, in this partcular case, for
>> __sk_free() to be called, sk_free() had to be called meaning the
>> last ref on the socket was released.  However, that's not possible since
>> we are still holding the association and thus holding the socket
>> associated with it.
>>
> I see what your saying, and I agree, with that bias added in sk_alloc, it looks
> like we won't ever call __sk_free until sk_wmem_alloc is 1 _and_ sk_refcnt is 0.
> It still seems messy and confusing though.  It would make more sense to me to
> increment the refcount an additional time when the socket is initalized, and
> then decrement it again when the socket is closed and sk_wmem_alloc reaches
> zero.  That would isolate the refcounting to a single variable.

See commit 2b85a34e911bf483c27cfdd124aeb1605145dc80.  The whole 
sk_wmem_alloc tric was done so that we dont have to do 
sock_hold/sock_put on transmits.

It might be good to see if we can do that in sctp as well.

-vlad


> Neil
>
>> -vlad
>>
>>>       -> sk_prot_free(.., sk)
>>>        -> kmem_cache_free(.., sk) or kfree(sk)
>>>   2) __sctp_write_space(asoc)
>>>   3) sctp_association_put(asoc)
>>>      -> sctp_association_destroy(asoc)
>>>       -> sctp_endpoint_put(asoc->ep)
>>>        -> sctp_endpoint_destroy(ep)
>>>         -> crypto_free_hash(sctp_sk(ep->base.sk)->hmac)
>>>            (etc, all unconditionally accessed while sk is
>>>             already dead/freed)
>>>
>>> Then, this might need a fix in general. :-)
>>>
>>> Assuming you would reduce the buffer space via setsockopt(.., SO_SNDBUF,
>>> ..),
>>> you might end up with a minimum buffer space of SOCK_MIN_SNDBUF [*] and
>>> a call to
>>> sk->sk_write_space(sk), which is sctp_write_space() and calls
>>> __sctp_write_space()
>>> on all asocs belonging to the socket, but it seems not to alter the current
>>> sk->sk_wmem_alloc I think, but rather sk->sk_sndbuf.
>>>
>>> [*] Btw, shouldn't this rather be (2048 + sizeof(struct sk_buff)) or
>>>      SKB_TRUESIZE(2048), at least like in SOCK_MIN_RCVBUF since we operate
>>>      on skb->truesize as well?
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>

  reply	other threads:[~2013-06-18 18:15 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-06-18  8:55 [PATCH net-next 0/5] Further SCTP changes Daniel Borkmann
2013-06-18  8:55 ` [PATCH net-next 1/5] net: sctp: remove TEST_FRAME ifdef Daniel Borkmann
2013-06-18  8:55 ` [PATCH net-next 2/5] ktime: add ms_to_ktime() and ktime_add_ms() helpers Daniel Borkmann
2013-06-18  8:55 ` [PATCH net-next 3/5] net: sctp: migrate cookie life from timeval to ktime Daniel Borkmann
2013-06-18  8:55 ` [PATCH net-next 4/5] net: sctp: decouple cleaning socket data from endpoint Daniel Borkmann
2013-06-18 14:22   ` Neil Horman
2013-06-18 16:02     ` Daniel Borkmann
2013-06-18 16:24       ` Vlad Yasevich
2013-06-18 17:45         ` Neil Horman
2013-06-18 18:15           ` Vlad Yasevich [this message]
2013-06-18 19:12             ` Neil Horman
2013-06-18 22:55               ` Vlad Yasevich
2013-06-19 11:55                 ` Neil Horman
2013-06-20 14:35   ` Vlad Yasevich
2013-06-20 17:29     ` Daniel Borkmann
2013-06-21  1:12       ` Vlad Yasevich
2013-06-22 21:38         ` Daniel Borkmann
2013-06-18  8:55 ` [PATCH net-next 5/5] net: sctp: minor: sctp_seq_dump_local_addrs add missing newline Daniel Borkmann
2013-06-20  1:39 ` [PATCH net-next 0/5] Further SCTP changes David Miller
2013-06-20 13:24   ` Neil Horman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51C0A3B9.80905@gmail.com \
    --to=vyasevich@gmail.com \
    --cc=davem@davemloft.net \
    --cc=dborkman@redhat.com \
    --cc=linux-sctp@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=nhorman@tuxdriver.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).