* Re: infinite spin in RT when booting with DHCP on
From: Thomas Gleixner @ 2012-02-08 20:41 UTC
To: Steven Rostedt
Cc: Hector Palacios, Uwe Kleine-König, Tim Sander,
linux-rt-users@vger.kernel.org, lclaudio@uudg.org, efault@gmx.de,
netdev, Shawn Guo
On Mon, 6 Feb 2012, Steven Rostedt wrote:
> On Mon, 2012-02-06 at 09:51 +0100, Hector Palacios wrote:
> > On 02/03/2012 06:39 PM, Steven Rostedt wrote:
> > > Note that you see that this causes a hang in the system if ksoftirqd is
> > > a real time task.
> >
> > This is true.
> >
> > > Not to mention, that ksoftirqd spins in an infinite
> > > loop if the cable isn't connected (regardless of ksoftirqd's priority).
> >
> > This is not true. The infinite loop is only hit when ksoftirqd is a real time task. I
> > think you got confused by the different patches we tried. That dirty hack of yours
> > with the workqueue was the one hanging with the cable disconnected. ;o)
> >
>
> I didn't say it was going to hang the box, I said it was going to spin.
>
> With the cable disconnected, did you run top to see if ksoftirqd was
> running at near 100%? It won't lock up the box because ksoftirqd is not
> a real time task in mainline.

NETDEV_TX_BUSY has always been a source of trouble and we carry a
bunch of patches in RT which handle the obvious candidates since we
encountered the first spinning lockup on RT.

Mainline does not notice as it falls back to the SCHED_OTHER softirq
thread after trying to reschedule the same thing over and over.

NETDEV_TX_BUSY simply should die. It's a bad design decision (invented
for mitigation of SMP lock contention problems) and its abuse by
driver writers to bridge the gap of hardware bringup is just a
consequence of that decision.

	if (!fep->link) {
		/* Link is down or autonegotiation is in progress. */
		return NETDEV_TX_BUSY;
	}

So instead of handling link down and autonegotiation gracefully this
code relies on the fact that a 2-second spinning loop goes unnoticed
in mainline because ksoftirqd runs with SCHED_OTHER.
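
To make the difference concrete, here is a minimal userspace sketch, not
kernel code. It assumes that "gracefully" means stopping the TX queue on
link down and waking it on link up, roughly what netif_stop_queue() and
netif_wake_queue() are for. All names in it (model_dev, xmit_busy,
xmit_gated, link_change, simulate) are made up for the illustration, and a
"pass" stands for one attempt by the softirq to push the pending packet.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not kernel code: names and values are illustrative. */
enum model_tx { MODEL_TX_OK, MODEL_TX_BUSY };

struct model_dev {
	bool link;            /* plays the role of fep->link */
	bool queue_stopped;   /* what a stop/wake of the TX queue would toggle */
	unsigned int wasted;  /* xmit calls that could not send anything */
	unsigned int sent;    /* packets actually handed to the hardware */
};

/* Quoted pattern: every call while the link is down reports "busy". */
static enum model_tx xmit_busy(struct model_dev *d)
{
	if (!d->link) {
		d->wasted++;
		return MODEL_TX_BUSY;
	}
	d->sent++;
	return MODEL_TX_OK;
}

/* Assumed alternative: xmit is never reached while the queue is stopped. */
static enum model_tx xmit_gated(struct model_dev *d)
{
	assert(d->link && !d->queue_stopped);
	d->sent++;
	return MODEL_TX_OK;
}

/* Link change handler for the alternative: stop on down, wake on up. */
static void link_change(struct model_dev *d, bool up)
{
	d->link = up;
	d->queue_stopped = !up;
}

/*
 * Run up to `passes` passes with one packet pending; the link comes up at
 * pass `up_at`. Returns the number of xmit calls that achieved nothing.
 */
unsigned int simulate(bool gated, unsigned int passes, unsigned int up_at)
{
	struct model_dev d = { .link = false, .queue_stopped = gated };
	bool pending = true;

	for (unsigned int i = 0; i < passes && pending; i++) {
		if (i == up_at) {
			if (gated)
				link_change(&d, true);
			else
				d.link = true;
		}
		if (d.queue_stopped)
			continue;	/* model only: a stopped queue is not retried */
		pending = (gated ? xmit_gated(&d) : xmit_busy(&d)) == MODEL_TX_BUSY;
	}
	return d.wasted;
}
```

In this model the busy variant burns one xmit call per pass until the link
comes up, while the gated variant makes none. With a real-time ksoftirqd
those wasted passes are what starves everything else.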
Oh well,
tglx