From: Vlad Yasevich <vladislav.yasevich@hp.com>
To: linux-sctp@vger.kernel.org
Subject: Re: BUG in sctp crashes sles10sp2 kernel
Date: Mon, 15 Dec 2008 15:38:56 +0000
Message-ID: <49467A10.8030802@hp.com>
In-Reply-To: <20081211145209.GB5236@dhcp35.suse.cz>

Karsten Keil wrote:
> Hi Vlad,
> 
> On Thu, Dec 11, 2008 at 10:28:35AM -0500, Vlad Yasevich wrote:
>> Michal Hocko wrote:
>>> Hi Vlad,
>>>
>>> I am starting this new thread because I am beginning to believe that the
>>> sles10sp2 kernel (based on the 2.6.16 upstream kernel) is hitting a
>>> different issue than the one we see in the upstream kernel (see below).
>>>
>>> Karsten (CCing him) has found out the following:
>>> "
>>> OK I think the
>>> KERNEL: assertion (!atomic_read(&sk->sk_wmem_alloc)) failed at
>>> net/ipv4/af_inet.c (149)
>>>
>>> is related to the main problem here. It says that at the time a socket
>>> gets destroyed there is still some wmem allocated. This means there is
>>> still a transmit skb in flight. Since sctp uses skb destructors to do the
>>> memory accounting, this also means that after the socket is destroyed, the
>>> destructor of this skb will access the already freed socket struct. In
>>> some cases (if the memory is in use again and the pointers are already
>>> overwritten) this causes the crash at {sock_wfree+48} (which is a call to
>>> sk->sk_write_space(sk);).  Of course it can crash in any other place too,
>>> since the accounting may overwrite pointers in any other struct that
>>> reuses this memory.
>>>
>>> I instrumented some routines with extra debug (e.g. inet_sock_destruct) to
>>> see the amount of memory in sk->sk_wmem_alloc; it mostly shows
>>>
>>> Dec 11 12:31:16 gw kernel: inet_sock_destruct:
>>> sk(ffff810116960e00)->sk_wmem_alloc 496
>>> Dec 11 12:31:17 gw kernel: inet_sock_destruct:
>>> sk(ffff8101144f1b00)->sk_wmem_alloc 496
>>> Dec 11 12:31:18 gw kernel: inet_sock_destruct:
>>> sk(ffff8101144f1b00)->sk_wmem_alloc -496
>>> Dec 11 12:31:20 gw kernel: inet_sock_destruct:
>>> sk(ffff81011d461a00)->sk_wmem_alloc 496
>>> Dec 11 12:31:21 gw kernel: inet_sock_destruct:
>>> sk(ffff81011d460080)->sk_wmem_alloc 496
>>>
>>> Note the -496. I think this is a case in which the same memory was
>>> allocated again for a socket struct, so the memory still held valid
>>> pointers, and the destructor call for the old socket decremented the
>>> memory count on the new socket.
>>>
>>> Do you agree with this analysis ?
>>> "
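The accounting problem described in the quoted analysis can be modelled in userspace. The following is only a sketch of the hypothesis, not kernel code: every `model_*` name is invented for illustration, and a single static slot stands in for a slab object that gets freed and handed out again. The 496-byte charge is taken from the log lines above.

```c
#include <stddef.h>

/* Hypothetical stand-in for a socket; not the kernel's struct sock. */
struct model_sock {
    int wmem_alloc;   /* bytes charged for in-flight transmit buffers */
    int in_use;       /* 1 while the slot is owned by a live socket */
};

/* Hypothetical stand-in for an skb that remembers which socket to uncharge. */
struct model_skb {
    struct model_sock *sk;
    int truesize;
};

/* One slot, reused on every allocation, to model memory reuse after free. */
static struct model_sock slot;

static struct model_sock *model_sock_alloc(void)
{
    slot.wmem_alloc = 0;
    slot.in_use = 1;
    return &slot;
}

/* Returns the leftover charge, like the instrumented destructor prints. */
static int model_sock_destruct(struct model_sock *sk)
{
    sk->in_use = 0;
    return sk->wmem_alloc;
}

/* Charge the buffer to its owning socket at transmit time. */
static void model_skb_charge(struct model_skb *skb, struct model_sock *sk,
                             int truesize)
{
    skb->sk = sk;
    skb->truesize = truesize;
    sk->wmem_alloc += truesize;
}

/* Runs when the buffer is finally freed; trusts skb->sk to still be valid. */
static void model_skb_destructor(struct model_skb *skb)
{
    skb->sk->wmem_alloc -= skb->truesize;
}
```

In this model, destroying a socket while one 496-byte buffer is still charged leaves a count of 496. If the slot is then reallocated and the late buffer destructor runs, the new socket ends up at -496. Those are the two values seen in the log, although the model by itself does not prove that this is what happens in the kernel.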
>>>
>>> I am trying to go through git logs but maybe you remember some fix in
>>> this area.
>>>
>>> If I understand correctly, then 20c2df83d25c6a95affe6157a4c9cac4cf5ffaac
>>> removes destructors from sctp completely, so the above should not
>>> happen upstream, should it?
>>>
>>
>> Here are a few commits that you need to check on:
>>
>> 61c9fed41638249f8b6ca5345064eb1beb50179f
>> [SCTP]: A better solution to fix the race between sctp_peeloff() and sctp_rcv().
>>
>> cfdeef3282705a4b872d3559c4e7d2561251363c
>> [SCTP]: Unhash the endpoint in sctp_endpoint_free().
>>
>> f26f7c480555812ca7c4037e0a50fa54afe2cb4a
>> [SCTP]: Add bind hash locking to the migrate code
>>
>>
>> All of the above commits address races in the SCTP code and are not in the base
>> 2.6.16 kernel.
>>
> 
> Thanks for your input.
> 
> 61c9fed41638249f8b6ca5345064eb1beb50179f
> [SCTP]: A better solution to fix the race between sctp_peeloff() and sctp_rcv().
> 
> seems to fix this issue; I also applied the other patches.
> 
> Now I no longer get the "KERNEL: assertion
> (!atomic_read(&sk->sk_wmem_alloc)) failed ..." messages.
> 
> But now I run into the skb_overflow BUG.
> With some extra debug (based on your debug patch) I see:
> 
> Possible SKB overflow: packet size = 76, packet overhead = 32, packet chunk = 1/4, chunk len =1040 packet padding 0 nskb len 12 mtu = 1500
> 
> "packet chunk = 1/4" reads as: the first chunk of 4 total chunks caused the overflow.

OK.  It appears that the list of chunks on the packet is not cleaned up
correctly.  It could also be that my earlier debug patches didn't reset
the number of chunks properly.

Since this is the same bug that I am trying to diagnose in 2.6.27+, it now
boils down to the same problem.  Either the packet is not cleaned up
correctly, or there is a race on the packet's chunk list.
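The first of those two possibilities, incomplete cleanup, can be sketched as a small userspace model. This is only an illustration under assumptions: the `model_*` names and the array-based chunk list are invented and do not match the kernel's packet structures. The overhead of 32 and the packet size of 76 are borrowed from the trace above, and the individual chunk lengths are made up.

```c
#include <stddef.h>

#define MODEL_MAX_CHUNKS 8

/* Hypothetical userspace model; not the kernel's struct sctp_packet. */
struct model_packet {
    size_t overhead;                     /* header bytes included in size */
    size_t size;                         /* accounted bytes: overhead + chunks */
    int    nchunks;                      /* entries queued in chunk_len[] */
    size_t chunk_len[MODEL_MAX_CHUNKS];  /* lengths of the queued chunks */
};

static void model_init(struct model_packet *p, size_t overhead)
{
    p->overhead = overhead;
    p->size = overhead;
    p->nchunks = 0;
}

/* Queue a chunk and account for its length. Returns 0 on success. */
static int model_append(struct model_packet *p, size_t len)
{
    if (p->nchunks >= MODEL_MAX_CHUNKS)
        return -1;
    p->chunk_len[p->nchunks++] = len;
    p->size += len;
    return 0;
}

/* Complete cleanup: the size and the chunk list are reset together. */
static void model_reset(struct model_packet *p)
{
    p->size = p->overhead;
    p->nchunks = 0;
}

/* Deliberately incomplete cleanup: stale chunks stay on the list. */
static void model_reset_buggy(struct model_packet *p)
{
    p->size = p->overhead;
}

/*
 * Model of transmit: the output buffer holds p->size bytes and every queued
 * chunk is copied into it. Returns 1 if the copies would run past the end
 * of the buffer, 0 otherwise.
 */
static int model_would_overflow(const struct model_packet *p)
{
    size_t used = p->overhead;
    int i;

    for (i = 0; i < p->nchunks; i++) {
        used += p->chunk_len[i];
        if (used > p->size)
            return 1;
    }
    return 0;
}
```

With three 16-byte chunks queued, `model_reset_buggy()` followed by one 44-byte append yields a packet that claims 76 bytes but still has four chunks on its list, so the copy loop runs past the buffer. The same sequence with `model_reset()` leaves one chunk and no overflow. A race on the chunk list would need a different model and is not covered here.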

I am currently somewhat off-line due to power issues at my house, but I'll
see if I can get a somewhat cleaned-up debug patch to you.

-vlad
> 
> At first I thought the padding might cause this, so I also print this
> value, but it is 0 in all traces.
> 
> I also applied
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=b90a137d30a6322d76023d879d40fc31f3edf0a6
> 
> which sounds likely to fix this kind of problem, but it seems that we do not
> hit this; the bug is still there.
> 
> 

