From: Andres Freund <andres@anarazel.de>
To: Jarek Poplawski <jarkao2@gmail.com>
Cc: Joao Correia <joaomiguelcorreia@gmail.com>,
Arun R Bharadwaj <arun@linux.vnet.ibm.com>,
Thomas Gleixner <tglx@linutronix.de>,
Stephen Hemminger <shemminger@vyatta.com>,
netdev@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: Soft-Lockup/Race in networking in 2.6.31-rc1+195 ( possibly caused by netem)
Date: Mon, 6 Jul 2009 18:13:29 +0200
Message-ID: <200907061813.29379.andres@anarazel.de>
In-Reply-To: <20090706141916.GA3477@ami.dom.local>
On Monday 06 July 2009 16:19:16 Jarek Poplawski wrote:
> On Mon, Jul 06, 2009 at 05:53:51AM +0100, Joao Correia wrote:
> > Hello
> >
> > System freezes immediately after grub, no init processing at all, after
> > applying those patches on top of vanilla 2.6.30 on my box.
>
> ...
>
> > doesn't work on top of 2.6.30. It complains, while compiling, that
> > sysctl_timer_migration is not defined. So I just replaced that call
> > with return 1, like in the non-debug case. Hope this doesn't defeat
> > your test case, but it wouldn't compile otherwise. Probably that was
> > just introduced after 2.6.30?
I stupidly sent two emails in private to Jarek. Reposting here:
Jarek:
> > > > > Yes, my bad, sorry. I've found 2 more patches from this series;
> > > > > can't guarantee that's all, but seems to work & migrate within my
> > > > > one and only core without any problems ;-)
Andres:
> > > > I have some doubt that this will give us new information:
> > > > The commit I bisected the failure to:
> > > > eea08f32adb3f97553d49a4f79a119833036000a
> > > > Is just 2.6.30-rc4 + the four commits you listed...
Jarek:
> > > I guess, you mean 2.6.31-rc1?
Andres:
> > No - I tested the timer development branch to rule out that it's a problem
> > caused by some other change between 2.6.30 and 2.6.31-git.
> > And that branch is based on rc4...
Jarek:
> I misunderstood, sorry! That's just what I needed to know!
Andres:
> > > > And I separately tested eea08f32adb3f97553d49a4f79a119833036000a^ to
> > > > be sure. So I am pretty sure it's those commits which trigger the
> > > > problem - what's causing it is another matter.
Jarek:
> > > It might be true, but it isn't 100% proof. This patchset is special:
> > > by moving timers to other cores it generates much more SMP concurrency,
> > > so it could trigger some hidden races, which otherwise need much more
> > > time to show up. So I'm trying to establish if this could be the case.
> > > Btw., I guess there is nothing to hide from the lists, plus somebody
> > > could verify this idea?
Andres:
> > No, absolutely not. Just hit the wrong key. Sorry.
> > Btw, I ran netem with delay for more than 48h at around 80 Mbit... That
> > does not exclude such a rarely triggered race, but makes it a bit more
> > unlikely. (With migration that's around 3 sec or so)
> This is very important information: it should give the timer guys some
> incentive to start looking into this, and me less incentive to verify
> network code ;-)
Jarek:
> Btw., there were some strange traces of lockdep and stack overrunning;
> did you check whether, without lockdep, there are some more readable
> warnings?
Lockdep was not enabled at first. Actually I think most if not all of the
traces I posted at first were without.
Will verify.
> And once again, consider resending this to the public, please. (At
> least Joao might be interested.)
Sorry once more.
Andres