From: Frank Schreuder <fschreuder@transip.nl>
To: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>,
	Florian Westphal <fw@strlen.de>
Cc: Johan Schuijt <johan@transip.nl>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	"nikolay@redhat.com" <nikolay@redhat.com>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"chutzpah@gentoo.org" <chutzpah@gentoo.org>,
	"Robin Geuze" <robing@transip.nl>,
	netdev <netdev@vger.kernel.org>
Subject: Re: reproducable panic eviction work queue
Date: Wed, 22 Jul 2015 17:31:27 +0200
Message-ID: <55AFB74F.8050809@transip.nl>
In-Reply-To: <55AFA55D.4000606@cumulusnetworks.com>


On 7/22/2015 at 4:14 PM, Nikolay Aleksandrov wrote:
> On 07/22/2015 04:03 PM, Nikolay Aleksandrov wrote:
>> On 07/22/2015 03:58 PM, Florian Westphal wrote:
>>> Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
>>>> On 07/22/2015 10:17 AM, Frank Schreuder wrote:
>>>>> I got some additional information from syslog:
>>>>>
>>>>> Jul 22 09:49:33 dommy0 kernel: [  675.987890] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:42]
>>>>> Jul 22 09:49:42 dommy0 kernel: [  685.114033] INFO: rcu_sched self-detected stall on CPU { 3}  (t=39918 jiffies g=988 c=987 q=23168)
>>>>>
>>>>> Thanks,
>>>>> Frank
>>>>>
>>>>>
>>>> Hi,
>>>> It looks like it's happening because of the evict_again logic. I think we should also
>>>> add Florian's first suggestion about simplifying it to the patch and just skip the
>>>> entry if we can't delete its timer; otherwise we can restart the eviction, see
>>>> entries that already had their timer stopped by us, and keep restarting for
>>>> a long time.
>>>> Here's an updated patch that removes the evict_again logic.
>>> Thanks Nik.  I'm afraid this adds a bug when a netns is exiting.
>>>
>>> Currently, we wait until timer has finished, but after the change
>>> we might destroy percpu counter while a timer is still executing on
>>> another cpu.
>>>
>>> I pushed a patch series to
>>> https://git.breakpoint.cc/cgit/fw/net.git/log/?h=inetfrag_fixes_02
>>>
>>> It includes this patch with a small change -- deferral of the percpu
>>> counter subtraction until after queue has been free'd.
>>>
>>> Frank -- it would be great if you could test with the four patches in
>>> that series applied.
>>>
>>> I'll then add your tested-by Tag to all of them before submitting this.
>>>
>>> Thanks again for all your help in getting this fixed!
>>>
>> Sure, I didn't think it through, just supplied it for the test. :-)
>> Thanks for fixing it up!
>>
> Patches look great, even the INET_FRAG_EVICTED flag will not be accidentally cleared
> this way. I'll give them a try.
>
>

Hi,

I'm currently building a new kernel based on 3.18.19 + patches.
One of the patches, however, fails to apply because we don't have a
"net/ieee802154/6lowpan/" directory.
Modifying the patch to use "net/ieee802154/reassembly.c" instead works
without problems.
Is this due to the different kernel version, or something else?

I'll get back to you as soon as I have my first test results.

Thanks,
Frank
