netdev.vger.kernel.org archive mirror
* Re: [Bugme-new] [Bug 10473] New: Infinite loop "b44: eth0: powering down PHY"
       [not found] <bug-10473-10286@http.bugzilla.kernel.org/>
@ 2008-04-18  0:12 ` Andrew Morton
  2008-04-18 14:06   ` Michael Buesch
  0 siblings, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2008-04-18  0:12 UTC (permalink / raw)
  To: netdev, Michael Buesch; +Cc: bugme-daemon, naoliv


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Thu, 17 Apr 2008 17:08:27 -0700 (PDT)
bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=10473
> 
>            Summary: Infinite loop "b44: eth0: powering down PHY"
>            Product: Drivers
>            Version: 2.5
>      KernelVersion: 2.6.25
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: high
>           Priority: P1
>          Component: Network
>         AssignedTo: jgarzik@pobox.com
>         ReportedBy: naoliv@gmail.com
> 
> 
> Latest working kernel version: 2.6.24.4 (from Debian)
> Earliest failing kernel version: 2.6.25 (vanilla)
> Distribution: Debian
> Problem Description:
> 
> While booting the new 2.6.25 kernel, it enters an infinite loop displaying
> "b44: eth0: powering down PHY".
> The system isn't frozen, as the magic SysRq keys work, but it just keeps
> displaying those messages. I am unable to dump any information using SysRq,
> however (as the b44(...) messages are too fast).
> 
> I will attach lspci output and my .config
> 

Apparently a regression.


* Re: [Bugme-new] [Bug 10473] New: Infinite loop "b44: eth0: powering down PHY"
  2008-04-18  0:12 ` [Bugme-new] [Bug 10473] New: Infinite loop "b44: eth0: powering down PHY" Andrew Morton
@ 2008-04-18 14:06   ` Michael Buesch
  2008-04-18 15:23     ` Nelson A. de Oliveira
  0 siblings, 1 reply; 17+ messages in thread
From: Michael Buesch @ 2008-04-18 14:06 UTC (permalink / raw)
  To: Andrew Morton; +Cc: netdev, bugme-daemon, naoliv, Gary Zambrano

CCed Gary (the b44 maintainer).
Not sure why I am actually CCed :)


On Friday 18 April 2008 02:12:15 Andrew Morton wrote:
> 
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> On Thu, 17 Apr 2008 17:08:27 -0700 (PDT)
> bugme-daemon@bugzilla.kernel.org wrote:
> 
> > http://bugzilla.kernel.org/show_bug.cgi?id=10473
> > 
> >            Summary: Infinite loop "b44: eth0: powering down PHY"
> >            Product: Drivers
> >            Version: 2.5
> >      KernelVersion: 2.6.25
> >           Platform: All
> >         OS/Version: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: high
> >           Priority: P1
> >          Component: Network
> >         AssignedTo: jgarzik@pobox.com
> >         ReportedBy: naoliv@gmail.com
> > 
> > 
> > Latest working kernel version: 2.6.24.4 (from Debian)
> > Earliest failing kernel version: 2.6.25 (vanilla)
> > Distribution: Debian
> > Problem Description:
> > 
> > While booting the new 2.6.25 kernel, it enters an infinite loop displaying
> > "b44: eth0: powering down PHY".
> > The system isn't frozen, as the magic SysRq keys work, but it just keeps
> > displaying those messages. I am unable to dump any information using SysRq,
> > however (as the b44(...) messages are too fast).
> > 
> > I will attach lspci output and my .config
> > 
> 
> Apparently a regression.

Can you add a dump_stack() call to the b44_halt() function and post the resulting logs?

-- 
Greetings Michael.


* Re: [Bugme-new] [Bug 10473] New: Infinite loop "b44: eth0: powering down PHY"
  2008-04-18 14:06   ` Michael Buesch
@ 2008-04-18 15:23     ` Nelson A. de Oliveira
  2008-04-18 15:32       ` Michael Buesch
  0 siblings, 1 reply; 17+ messages in thread
From: Nelson A. de Oliveira @ 2008-04-18 15:23 UTC (permalink / raw)
  To: Michael Buesch; +Cc: Andrew Morton, netdev, bugme-daemon, Gary Zambrano

Hi!

On Fri, Apr 18, 2008 at 11:06 AM, Michael Buesch <mb@bu3sch.de> wrote:
>  > > While booting the new 2.6.25 kernel, it enters an infinite loop displaying
>  > > "b44: eth0: powering down PHY".
>  > > The system isn't frozen, as the magic SysRq keys work, but it just keeps
>  > > displaying those messages. I am unable to dump any information using SysRq,
>  > > however (as the b44(...) messages are too fast).
>  > >
>  > > I will attach lspci output and my .config
>  > >
>  >
>  > Apparently a regression.
>
>  Can you add a dump_stack() call to the b44_halt() function and post the resulting logs?

What I get is:

Pid: 4, comm: ksoftirqd/0 Tainted: GF 2.6.25-naoliv1 #2
[<f8992420>] [<b0231ffd>] [<b02380b9>] [<b011ed60>] [<b01060f2>]
[<b011f059>] [<b011efe1>] [<b01297c8>] [<b0129790>] [<b010473f>]

Something tells me that this won't help much and that I probably need
to enable some debug option (what do I need to enable, please?)

BTW, is there a way to "pause" the messages after dump_stack()?

Thank you!

Best regards,
Nelson


* Re: [Bugme-new] [Bug 10473] New: Infinite loop "b44: eth0: powering down PHY"
  2008-04-18 15:23     ` Nelson A. de Oliveira
@ 2008-04-18 15:32       ` Michael Buesch
  2008-04-18 17:12         ` Nelson A. de Oliveira
  0 siblings, 1 reply; 17+ messages in thread
From: Michael Buesch @ 2008-04-18 15:32 UTC (permalink / raw)
  To: Nelson A. de Oliveira; +Cc: Andrew Morton, netdev, bugme-daemon, Gary Zambrano

On Friday 18 April 2008 17:23:06 Nelson A. de Oliveira wrote:
> Hi!
> 
> On Fri, Apr 18, 2008 at 11:06 AM, Michael Buesch <mb@bu3sch.de> wrote:
> >  > > While booting the new 2.6.25 kernel, it enters an infinite loop displaying
> >  > > "b44: eth0: powering down PHY".
> >  > > The system isn't frozen, as the magic SysRq keys work, but it just keeps
> >  > > displaying those messages. I am unable to dump any information using SysRq,
> >  > > however (as the b44(...) messages are too fast).
> >  > >
> >  > > I will attach lspci output and my .config
> >  > >
> >  >
> >  > Apparently a regression.
> >
> >  Can you add a dump_stack() call to the b44_halt() function and post the resulting logs?
> 
> What I get is:
> 
> Pid: 4, comm: ksoftirqd/0 Tainted: GF 2.6.25-naoliv1 #2
> [<f8992420>] [<b0231ffd>] [<b02380b9>] [<b011ed60>] [<b01060f2>]
> [<b011f059>] [<b011efe1>] [<b01297c8>] [<b0129790>] [<b010473f>]

Ehm, please enable CONFIG_KALLSYMS.

> BTW, is there a way to "pause" the messages after dump_stack()?

mdelay(1000) will delay one second. But it will kill the system, basically.

-- 
Greetings Michael.
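
A possible alternative to stalling the machine with mdelay() is to dump the stack only the first time the path is hit. The following is a minimal userspace sketch of that print-once pattern, not driver code: DO_ONCE, fake_dump_stack() and fake_halt() are hypothetical names standing in for a guard macro, dump_stack() and b44_halt(), and the guard is not meant to be safe against concurrent callers.

```c
#include <stdio.h>

/* Hypothetical helper: run `stmt` only the first time this call site is hit. */
#define DO_ONCE(stmt)                 \
    do {                              \
        static int done_;             \
        if (!done_) {                 \
            done_ = 1;                \
            stmt;                     \
        }                             \
    } while (0)

/* Number of stack dumps actually emitted so far. */
static int dumps_emitted;

/* Stand-in for dump_stack(): counts calls and prints a marker line. */
static void fake_dump_stack(void)
{
    dumps_emitted++;
    printf("stack dump #%d\n", dumps_emitted);
}

/* Stand-in for b44_halt(): may be called in a tight loop, dumps only once. */
static void fake_halt(void)
{
    DO_ONCE(fake_dump_stack());
}
```

Calling fake_halt() repeatedly emits a single dump, so the first trace stays on screen instead of scrolling away.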


* Re: [Bugme-new] [Bug 10473] New: Infinite loop "b44: eth0: powering down PHY"
  2008-04-18 15:32       ` Michael Buesch
@ 2008-04-18 17:12         ` Nelson A. de Oliveira
  2008-04-18 17:19           ` Michael Buesch
  0 siblings, 1 reply; 17+ messages in thread
From: Nelson A. de Oliveira @ 2008-04-18 17:12 UTC (permalink / raw)
  To: Michael Buesch; +Cc: Andrew Morton, netdev, bugme-daemon, Gary Zambrano

Hi!

On Fri, Apr 18, 2008 at 12:32 PM, Michael Buesch <mb@bu3sch.de> wrote:
>  Ehm, please enable CONFIG_KALLSYMS.

Right. Sorry.

Here it is:

Pid: 4, comm: ksoftirqd/0 Tainted: GF 2.6.25-naoliv #4
[<f899df84>] b44_halt+0x68/0x7f [b44]
[<f899f432>] b44_poll+0x36a/0x405 [b44]
[<b02393ad>] net_rx_action+0x63/0x131
[<b011ee60>] __do_softirq+0x5a/0xa5
[<b01061e2>] do_softirq+0x52/0x84
[<b011f159>] ksoftirqd+0x78/0x110
[<b011f0e1>] ksoftirqd+0x0/0x110
[<b01298d4>] kthread+0x38/0x60
[<b012989c>] kthread+0x0/0x60
[<b010474b>] kernel_thread_helper+0x7/0x10

Anything else that I can do to help, please?

Thank you!

Best regards,
Nelson


* Re: [Bugme-new] [Bug 10473] New: Infinite loop "b44: eth0: powering down PHY"
  2008-04-18 17:12         ` Nelson A. de Oliveira
@ 2008-04-18 17:19           ` Michael Buesch
  2008-04-18 17:43             ` Nelson A. de Oliveira
  0 siblings, 1 reply; 17+ messages in thread
From: Michael Buesch @ 2008-04-18 17:19 UTC (permalink / raw)
  To: Nelson A. de Oliveira; +Cc: Andrew Morton, netdev, bugme-daemon, Gary Zambrano

On Friday 18 April 2008 19:12:34 Nelson A. de Oliveira wrote:
> Anything else that I can do to help, please?

Please apply this patch and send me the messages.

Index: wireless-testing/drivers/net/b44.c
===================================================================
--- wireless-testing.orig/drivers/net/b44.c	2008-04-15 12:40:17.000000000 +0200
+++ wireless-testing/drivers/net/b44.c	2008-04-18 19:18:02.000000000 +0200
@@ -866,6 +866,7 @@ static int b44_poll(struct napi_struct *
 	if (bp->istat & ISTAT_ERRORS) {
 		unsigned long flags;
 
+printk(KERN_ERR "b44_poll: istat = 0x%08X\n", bp->istat);
 		spin_lock_irqsave(&bp->lock, flags);
 		b44_halt(bp);
 		b44_init_rings(bp);


-- 
Greetings Michael.


* Re: [Bugme-new] [Bug 10473] New: Infinite loop "b44: eth0: powering down PHY"
  2008-04-18 17:19           ` Michael Buesch
@ 2008-04-18 17:43             ` Nelson A. de Oliveira
  2008-04-18 17:59               ` Michael Buesch
  0 siblings, 1 reply; 17+ messages in thread
From: Nelson A. de Oliveira @ 2008-04-18 17:43 UTC (permalink / raw)
  To: Michael Buesch; +Cc: Andrew Morton, netdev, bugme-daemon, Gary Zambrano

Hi!

On Fri, Apr 18, 2008 at 2:19 PM, Michael Buesch <mb@bu3sch.de> wrote:
> On Friday 18 April 2008 19:12:34 Nelson A. de Oliveira wrote:
>  > Anything else that I can do to help, please?
>
>  Please apply this patch and send me the messages.

b44_poll: istat = 0x00000400
b44: eth0: powering down PHY

Pid: 0, comm: swapper Not tainted 2.6.25-naoliv1 #4
[<f899df84>] b44_halt+0x68/0x7f [b44]
[<f88f4440>] b44_poll+0x378/0x415 [b44]
[<b010453b>] common_interrupt+0x23/0x28
[<b02393ad>] net_rx_action+0x63/0x131
[<b011ee60>] __do_softirq+0x5a/0xa5
[<b01061e2>] do_softirq+0x52/0x84
[<b013d666>] handle_fasteoi_irq+0x0/0xad
[<b011ed3e>] irq_exit+0x35/0x76
[<b01062ad>] do_IRQ+0x99/0xb0
[<b010453b>] common_interrupt+0x23/0x28
[<f886700b>] acpi_idle_enter_bm+0x28c/0x2fd6 [processor]
[<b022602b>] cpuidle_idle_call+0x55/0x86
[<b0225fd6>] cpuidle_idle_call+0x0/0x86
[<b01028d3>] cpu_idle+0x8c/0xbc

I have increased the delay now.  This is the first message that
appears. It seems that after some time it starts to display the other
lines from my last email (Pid 4, comm: ksoftirqd/0 ...).

Best regards,
Nelson


* Re: [Bugme-new] [Bug 10473] New: Infinite loop "b44: eth0: powering down PHY"
  2008-04-18 17:43             ` Nelson A. de Oliveira
@ 2008-04-18 17:59               ` Michael Buesch
  2008-04-18 18:09                 ` Nelson A. de Oliveira
  0 siblings, 1 reply; 17+ messages in thread
From: Michael Buesch @ 2008-04-18 17:59 UTC (permalink / raw)
  To: Nelson A. de Oliveira; +Cc: Andrew Morton, netdev, bugme-daemon, Gary Zambrano

On Friday 18 April 2008 19:43:57 Nelson A. de Oliveira wrote:
> Hi!
> 
> On Fri, Apr 18, 2008 at 2:19 PM, Michael Buesch <mb@bu3sch.de> wrote:
> > On Friday 18 April 2008 19:12:34 Nelson A. de Oliveira wrote:
> >  > Anything else that I can do to help, please?
> >
> >  Please apply this patch and send me the messages.
> 
> b44_poll: istat = 0x00000400

Hm, a descriptor error. Smells like my DMA fix actually broke this, dammit.
On which architecture are you running?

> I have increased the delay now.  This is the first message that
> appears. It seems that after some time it starts to display the other
> lines from my last email (Pid 4, comm: ksoftirqd/0 ...).

I'm always only interested in the first message of one type :)

-- 
Greetings Michael.
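
For readers following along, the istat value can be decoded bit by bit. This is a small standalone sketch, not code from the driver: the table and decode_istat() are illustrative names, and the bit values are given as I recall them from drivers/net/b44.h. Only the 0x00000400 "descriptor error" reading is confirmed by this thread, so the other entries should be checked against the header.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct istat_bit {
    uint32_t mask;
    const char *name;
};

/* Error bits of the interrupt status register (assumed values, see above). */
static const struct istat_bit istat_error_bits[] = {
    { 0x00000400u, "descriptor error" },
    { 0x00000800u, "data error" },
    { 0x00001000u, "descriptor protocol error" },
    { 0x00002000u, "receive descriptor underflow" },
    { 0x00004000u, "receive FIFO overflow" },
    { 0x00008000u, "transmit FIFO underflow" },
};

/* Print the name of every error bit set in `istat`; return how many were set. */
static int decode_istat(uint32_t istat)
{
    int count = 0;
    size_t n = sizeof(istat_error_bits) / sizeof(istat_error_bits[0]);

    for (size_t i = 0; i < n; i++) {
        if (istat & istat_error_bits[i].mask) {
            printf("istat 0x%08X: %s\n", (unsigned)istat,
                   istat_error_bits[i].name);
            count++;
        }
    }
    return count;
}
```

With the value reported above, decode_istat(0x00000400) reports exactly one condition, the descriptor error.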


* Re: [Bugme-new] [Bug 10473] New: Infinite loop "b44: eth0: powering down PHY"
  2008-04-18 17:59               ` Michael Buesch
@ 2008-04-18 18:09                 ` Nelson A. de Oliveira
  2008-04-18 18:18                   ` Michael Buesch
  0 siblings, 1 reply; 17+ messages in thread
From: Nelson A. de Oliveira @ 2008-04-18 18:09 UTC (permalink / raw)
  To: Michael Buesch; +Cc: Andrew Morton, netdev, bugme-daemon, Gary Zambrano

On Fri, Apr 18, 2008 at 2:59 PM, Michael Buesch <mb@bu3sch.de> wrote:
>  > b44_poll: istat = 0x00000400
>
>  Hm, a descriptor error. Smells like my DMA fix actually broke this, dammit.
>  On which architecture are you running?

i386 here.

>  > I have increased the delay now.  This is the first message that
>  > appears. It seems that after some time it starts to display the other
>  > lines from my last email (Pid 4, comm: ksoftirqd/0 ...).
>
>  I'm always only interested in the first message of one type :)

Right :-)

Best regards,
Nelson


* Re: [Bugme-new] [Bug 10473] New: Infinite loop "b44: eth0: powering down PHY"
  2008-04-18 18:09                 ` Nelson A. de Oliveira
@ 2008-04-18 18:18                   ` Michael Buesch
  2008-04-18 19:02                     ` Nelson A. de Oliveira
  0 siblings, 1 reply; 17+ messages in thread
From: Michael Buesch @ 2008-04-18 18:18 UTC (permalink / raw)
  To: Nelson A. de Oliveira; +Cc: Andrew Morton, netdev, bugme-daemon, Gary Zambrano

On Friday 18 April 2008 20:09:36 Nelson A. de Oliveira wrote:
> On Fri, Apr 18, 2008 at 2:59 PM, Michael Buesch <mb@bu3sch.de> wrote:
> >  > b44_poll: istat = 0x00000400
> >
> >  Hm, a descriptor error. Smells like my DMA fix actually broke this, dammit.
> >  On which architecture are you running?
> 
> i386 here.

Hm, I tested my patch on i386.
So I'm not sure what's going on, actually. And the patch was pretty
trivial and I really can't find a bug in it.
So you say 2.6.24 was still working?

-- 
Greetings Michael.


* Re: [Bugme-new] [Bug 10473] New: Infinite loop "b44: eth0: powering down PHY"
  2008-04-18 18:18                   ` Michael Buesch
@ 2008-04-18 19:02                     ` Nelson A. de Oliveira
  2008-04-18 19:19                       ` Michael Buesch
  0 siblings, 1 reply; 17+ messages in thread
From: Nelson A. de Oliveira @ 2008-04-18 19:02 UTC (permalink / raw)
  To: Michael Buesch; +Cc: Andrew Morton, netdev, bugme-daemon, Gary Zambrano

Hi!

On Fri, Apr 18, 2008 at 3:18 PM, Michael Buesch <mb@bu3sch.de> wrote:
> On Friday 18 April 2008 20:09:36 Nelson A. de Oliveira wrote:
>  > On Fri, Apr 18, 2008 at 2:59 PM, Michael Buesch <mb@bu3sch.de> wrote:
>  > >  > b44_poll: istat = 0x00000400
>  > >
>  > >  Hm, a descriptor error. Smells like my DMA fix actually broke this, dammit.
>  > >  On which architecture are you running?
>  >
>  > i386 here.
>
>  Hm, I tested my patch on i386.
>  So I'm not sure what's going on, actually. And the patch was pretty
>  trivial and I really can't find a bug in it.
>  So you say 2.6.24 was still working?

Strange... compiled 2.6.24.4, 2.6.24 and 2.6.23 here and they are all
stopping with this:

b44: eth0: Link is up at 100 Mbps, full duplex.
b44: eth0: Flow control is off for TX and off for RX.

And it seems to keep waiting for something. The system isn't frozen
(as CTRL+ALT+DEL kills the running processes and correctly reboots the
machine).

With Debian's 2.6.24.4 it is working.
With vanilla 2.6.25 and my config it just enters an infinite loop of
"b44: eth0: powering down PHY".
Can different GCC versions cause this? Can a bad .config file cause
things like that? (I have been using this .config for a long time and it has
always worked correctly, at least until now)

Thank you!

Best regards,
Nelson


* Re: [Bugme-new] [Bug 10473] New: Infinite loop "b44: eth0: powering down PHY"
  2008-04-18 19:02                     ` Nelson A. de Oliveira
@ 2008-04-18 19:19                       ` Michael Buesch
  2008-04-18 20:38                         ` Nelson A. de Oliveira
  0 siblings, 1 reply; 17+ messages in thread
From: Michael Buesch @ 2008-04-18 19:19 UTC (permalink / raw)
  To: Nelson A. de Oliveira; +Cc: Andrew Morton, netdev, bugme-daemon, Gary Zambrano

On Friday 18 April 2008 21:02:37 Nelson A. de Oliveira wrote:
> Hi!
> 
> On Fri, Apr 18, 2008 at 3:18 PM, Michael Buesch <mb@bu3sch.de> wrote:
> > On Friday 18 April 2008 20:09:36 Nelson A. de Oliveira wrote:
> >  > On Fri, Apr 18, 2008 at 2:59 PM, Michael Buesch <mb@bu3sch.de> wrote:
> >  > >  > b44_poll: istat = 0x00000400
> >  > >
> >  > >  Hm, a descriptor error. Smells like my DMA fix actually broke this, dammit.
> >  > >  On which architecture are you running?
> >  >
> >  > i386 here.
> >
> >  Hm, I tested my patch on i386.
> >  So I'm not sure what's going on, actually. And the patch was pretty
> >  trivial and I really can't find a bug in it.
> >  So you say 2.6.24 was still working?
> 
> Strange... compiled 2.6.24.4, 2.6.24 and 2.6.23 here and they are all
> stopping with this:
> 
> b44: eth0: Link is up at 100 Mbps, full duplex.
> b44: eth0: Flow control is off for TX and off for RX.
>
> And it seems to keep waiting for something. The system isn't frozen
> (as CTRL+ALT+DEL kills the running processes and correctly reboots the
> machine).

Well. 2.6.24 didn't have this message. But it could still have the actual
bug, of course. So can you try applying my printk patch to a broken 2.6.24
kernel and see whether it triggers the message or not? Under normal
circumstances this codepath should never trigger.

> With Debian's 2.6.24.4 it is working.
> With vanilla 2.6.25 and my config it just enters an infinite loop of
> "b44: eth0: powering down PHY".

This message was added in 2.6.25. That doesn't mean the
bug was also added in 2.6.25, of course.

> Can different GCC versions cause this? Can a bad .config file cause
> things like that? (I have been using this .config for a long time and it has
> always worked correctly, at least until now)

Well, possible, although unlikely.

Can you try bisecting the bug? Yeah, I know about the lwn article [1] that
says bisecting is baaaaaad (tm), but my opinion is different. :)
It's an excellent tool for efficiently finding patches that caused bugs.
But take care to really check whether the device _works_ or not. Just looking
at the actual "powering down PHY" will _not_ be enough, as that was only
recently added, as I said.

[1] http://lwn.net/Articles/278137/

-- 
Greetings Michael.
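
To illustrate why bisection is efficient, here is a minimal sketch of the underlying search, assuming a linear history in which every revision from some point onward is bad. It is not how git implements bisect; bisect_first_bad(), is_bad_fn and bad_from() are hypothetical names. Per the advice above, a real test predicate should be "the device does not work", not "the message is printed".

```c
#include <stddef.h>

/* Hypothetical test callback: nonzero if revision `rev` shows the bug. */
typedef int (*is_bad_fn)(size_t rev, void *ctx);

/*
 * Given a known-good revision index and a later known-bad one, return the
 * first bad revision. `*tests` is incremented once per kernel that would
 * have to be built and booted.
 */
static size_t bisect_first_bad(size_t good, size_t bad, is_bad_fn is_bad,
                               void *ctx, unsigned *tests)
{
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;

        (*tests)++;
        if (is_bad(mid, ctx))
            bad = mid;      /* the bug was introduced at or before mid */
        else
            good = mid;     /* the bug was introduced after mid */
    }
    return bad;
}

/* Example predicate: every revision at or after *ctx is bad. */
static int bad_from(size_t rev, void *ctx)
{
    return rev >= *(const size_t *)ctx;
}
```

With 1000 revisions between the good and bad endpoints, the loop needs at most about ten test boots, which is why the number of kernels to build grows only logarithmically with the size of the range.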


* Re: [Bugme-new] [Bug 10473] New: Infinite loop "b44: eth0: powering down PHY"
  2008-04-18 19:19                       ` Michael Buesch
@ 2008-04-18 20:38                         ` Nelson A. de Oliveira
  2008-04-21 18:14                           ` Nelson A. de Oliveira
  0 siblings, 1 reply; 17+ messages in thread
From: Nelson A. de Oliveira @ 2008-04-18 20:38 UTC (permalink / raw)
  To: Michael Buesch; +Cc: Andrew Morton, netdev, bugme-daemon, Gary Zambrano

Hi!

On Fri, Apr 18, 2008 at 4:19 PM, Michael Buesch <mb@bu3sch.de> wrote:
>  Well. 2.6.24 didn't have this message. But it could still have the actual
>  bug, of course. So can you try applying my printk patch to a broken 2.6.24
>  kernel and see whether it triggers the message or not? Under normal
>  circumstances this codepath should never trigger.

No b44_poll message printed when using your patch on 2.6.24, 2.6.23 and 2.6.21.

>  Can you try bisecting the bug? Yeah, I know about the lwn article [1] that
>  says bisecting is baaaaaad (tm), but my opinion is different. :)
>  It's an excellent tool for efficiently finding patches that caused bugs.
>  But take care to really check whether the device _works_ or not. Just looking
>  at the actual "powering down PHY" will _not_ be enough, as that was only
>  recently added, as I said.

Sure. I will do this when I get home (can you point me to some
URL explaining how to do the bisection, please?).
What I saw with 2.6.24, 2.6.23 and 2.6.21 is that the interface seems
to be up, getting an IP via DHCP (I can ping from another machine),
but it stays waiting for something after printing

b44: eth0: Link is up at 100 Mbps, full duplex.
b44: eth0: Flow control is off for TX and off for RX.

Thank you!

Best regards,
Nelson


* Re: [Bugme-new] [Bug 10473] New: Infinite loop "b44: eth0: powering down PHY"
  2008-04-18 20:38                         ` Nelson A. de Oliveira
@ 2008-04-21 18:14                           ` Nelson A. de Oliveira
  2008-04-21 18:21                             ` Michael Buesch
  0 siblings, 1 reply; 17+ messages in thread
From: Nelson A. de Oliveira @ 2008-04-21 18:14 UTC (permalink / raw)
  To: Michael Buesch; +Cc: Andrew Morton, netdev, bugme-daemon, Gary Zambrano

Hi!

I have tried to do a bisect here (thank you Jike Song for the link).
Marked 2.6.20 as good and master as bad. On the first test, I've got this:

(...)
BUG: unable to handle kernel NULL pointer dereference at virtual
address 00000000
 printing eip:
b01b6265
*pde = 00000000
Oops: 0000 [#1]
PREEMPT SMP
Modules linked in: b44(F) mousedev(F) iwl3945(F) ehci_hcd(F)
mac80211(F) snd_hda_intel(F) thermal(F) i2c_i801(F) ac(F) ssb(F)
snd_pcm(F) snd_timer(F) uhci_hcd(F) psmouse(F) evdev(F) battery(F)
button(F) processor(F) mii(F) usbcore(F) snd(F) snd_page_alloc(F)
sg(F) sr_mod(F) cdrom(F)
CPU:    0
EIP:    0060:[<b01b6265>]    Tainted: GF       VLI
EFLAGS: 00010246   (2.6.23-naoliv1 #1)
EIP is at strlen+0x8/0x11
eax: 00000000   ebx: f7429000   ecx: ffffffff   edx: f76b6cb0
esi: 00000000   edi: 00000000   ebp: 00000000   esp: f76b6ca0
ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068
Process modprobe (pid: 692, ti=f76b6000 task=f76c6000 task.ti=f76b6000)
Stack: f75a2000 b01b3254 f785f200 b02d3e5b b02cb0da f7856200 b01b324a f7426688
       b02cb0da f7856200 f88f61f0 b0310c8c f785f200 f88f8e28 b0207d4e f74266a8
       f88f8d9c f7426600 f7426600 00000000 f7453400 b0206d8b f7426688 b02c4e14
Call Trace:
 [<b01b3254>] kobject_uevent_env+0x276/0x383
 [<b01b324a>] kobject_uevent_env+0x26c/0x383
 [<b0207d4e>] bus_add_device+0xad/0xdc
 [<b0206d8b>] device_add+0x2a0/0x45e
 [<f88f36f1>] ssb_attach_queued_buses+0x1a2/0x297 [ssb]
 [<f88f3b2f>] ssb_bus_register+0x120/0x185 [ssb]
 [<f88f4ac2>] ssb_pci_get_invariants+0x0/0x281 [ssb]
 [<f88f3bf3>] ssb_bus_pcibus_register+0x24/0x47 [ssb]
 [<b01bb856>] pci_set_master+0x54/0x58
 [<f88f52b1>] ssb_pcihost_probe+0x5e/0x89 [ssb]
 [<b01bd0ff>] pci_device_probe+0x36/0x55
 [<b020857e>] driver_probe_device+0xc5/0x148
 [<b02890a5>] klist_next+0x58/0x6d
 [<b02086dc>] __driver_attach+0x49/0x7f
 [<b0207ba8>] bus_for_each_dev+0x35/0x57
 [<b02083f2>] driver_attach+0x16/0x18
 [<b0208693>] __driver_attach+0x0/0x7f
 [<b0207e56>] bus_add_driver+0x6d/0x17d
 [<b01bd249>] __pci_register_driver+0x55/0x81
 [<f881d01f>] b44_init+0x1f/0x48 [b44]
 [<b013cdcc>] sys_init_module+0x1545/0x1619
 [<b0103e9a>] sysenter_past_esp+0x5f/0x85
 =======================
Code: f0 48 5e c3 56 89 d1 89 c6 83 ec 04 31 d2 89 c8 88 c4 ac 38 e0
75 03 8d 56 ff 84 c0 75 f4 5e 89 d0 5e c3 57 83 c9 ff 89 c7 31 c0 <f2>
ae f7 d1 49 5f 89 c8 c3 57 89 c7 89 d0 31 d2 85 c9 74 0c f2
EIP: [<b01b6265>] strlen+0x8/0x11 SS:ESP 0068:f76b6ca0
hub 1-2:1.0: hub_port_status failed (err = -71)
hub 1-2:1.0: hub_port_status failed (err = -71)
hub 1-2:1.0: hub_port_status failed (err = -71)
hub 1-2:1.0: hub_port_status failed (err = -71)
Clocksource tsc unstable (delta = -162081422 ns)
usb 5-2: new high speed USB device using ehci_hcd and address 2
usb 5-2: configuration #1 chosen from 1 choice
hub 5-2:1.0: USB hub found
hub 5-2:1.0: 4 ports detected
sysfs: duplicate filename 'bInterfaceNumber' can not be created
WARNING: at fs/sysfs/dir.c:425 sysfs_add_one()
 [<b018bebc>] sysfs_add_one+0x54/0xb8
 [<b018ba00>] sysfs_add_file+0x42/0x6a
 [<b018d115>] sysfs_create_group+0x84/0xe7
 [<b0206f3f>] device_add+0x454/0x45e
 [<f88ca72a>] usb_create_sysfs_intf_files+0x24/0x98 [usbcore]
 [<f88c7295>] usb_set_configuration+0x48f/0x4a9 [usbcore]
 [<f88cdcdb>] generic_probe+0x50/0x91 [usbcore]
 [<f88c8784>] usb_probe_device+0x32/0x37 [usbcore]
 [<b020857e>] driver_probe_device+0xc5/0x148
 [<b02890a5>] klist_next+0x58/0x6d
 [<b0207aa8>] bus_for_each_drv+0x35/0x5c
 [<b020867f>] device_attach+0x5e/0x72
 [<b0208601>] __device_attach+0x0/0x5
 [<b0207a24>] bus_attach_device+0x26/0x75
 [<b0206d92>] device_add+0x2a7/0x45e
 [<f88c2c1a>] usb_new_device+0x4d/0x8a [usbcore]
 [<f88c3746>] hub_thread+0x702/0xa8f [usbcore]
 [<b012fd84>] autoremove_wake_function+0x0/0x33
 [<f88c3044>] hub_thread+0x0/0xa8f [usbcore]
 [<b012fcb7>] kthread+0x38/0x5d
 [<b012fc7f>] kthread+0x0/0x5d
 [<b0104abb>] kernel_thread_helper+0x7/0x10
 =======================
(...)

This one is probably 2.6.23.
After some time the system continued to boot, but without a network interface.
So I marked it as bad.
The next bisect step failed to compile, so I marked it bad. I bisected again
and it failed, and again it failed :-(
My git-bisect log is:

git-bisect start
# good: [62d0cfcb27cf755cebdc93ca95dabc83608007cd] Linux 2.6.20
git-bisect good 62d0cfcb27cf755cebdc93ca95dabc83608007cd
# bad: [3925e6fc1f774048404fdd910b0345b06c699eb4] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
git-bisect bad 3925e6fc1f774048404fdd910b0345b06c699eb4
# bad: [3749c66c67fb5c257771815c186bc32290cacf44] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
git-bisect bad 3749c66c67fb5c257771815c186bc32290cacf44
# bad: [b11115c15351faba978ce1b9e75068e77f6ef48d] serial_core.h: include <linux/sysrq.h>
git-bisect bad b11115c15351faba978ce1b9e75068e77f6ef48d
# bad: [1936502d00ae6c2aa3931c42f6cf54afaba094f2] [NET_SCHED] qdisc: avoid transmit softirq on watchdog wakeup
git-bisect bad 1936502d00ae6c2aa3931c42f6cf54afaba094f2

What else can I do, please?

Thank you very much!

Best regards,
Nelson


* Re: [Bugme-new] [Bug 10473] New: Infinite loop "b44: eth0: powering down PHY"
  2008-04-21 18:14                           ` Nelson A. de Oliveira
@ 2008-04-21 18:21                             ` Michael Buesch
  2008-04-22  3:01                               ` Nelson A. de Oliveira
  0 siblings, 1 reply; 17+ messages in thread
From: Michael Buesch @ 2008-04-21 18:21 UTC (permalink / raw)
  To: Nelson A. de Oliveira; +Cc: Andrew Morton, netdev, bugme-daemon, Gary Zambrano

On Monday 21 April 2008 20:14:44 Nelson A. de Oliveira wrote:
> This one is probably 2.6.23.
> After some time the system continued to boot, but without network interface.
> So marked it as bad.

That was probably a mistake.

> What else can I do, please?

You can try the latest git. I was told it has a feature
to tell bisect "I don't know" instead of "good" or "bad".
This can be used if a test kernel doesn't compile, or
fails because of some other bug.

You can also manually bisect the changes between your known-good
version of b44 and the bad one. There were only a couple of patches.
You can extract them with git and revert them one by one and see
when it starts working again.
I think it was something like 5 patches or so. Nothing too time consuming.

-- 
Greetings Michael.


* Re: [Bugme-new] [Bug 10473] New: Infinite loop "b44: eth0: powering down PHY"
  2008-04-21 18:21                             ` Michael Buesch
@ 2008-04-22  3:01                               ` Nelson A. de Oliveira
  2008-04-22 13:34                                 ` Michael Buesch
  0 siblings, 1 reply; 17+ messages in thread
From: Nelson A. de Oliveira @ 2008-04-22  3:01 UTC (permalink / raw)
  To: Michael Buesch; +Cc: Andrew Morton, netdev, bugme-daemon, Gary Zambrano

[-- Attachment #1: Type: text/plain, Size: 1266 bytes --]

Hi!

Maybe this can help:
Using a new .config, I started to enable/disable options and test.
What I found is that if I enable "3G/1G user/kernel split", the kernel
works (it boots normally, the network interface works, etc). If I
select "3G/1G user/kernel split (for full 1G low memory)" I get the
infinite loop of "b44: eth0: powering down PHY".

Working config file (on 2.6.25) is attached.
Diff to the non-working is below:

--- working_config	2008-04-21 23:42:40.000000000 -0300
+++ not_working_config	2008-04-21 23:55:28.000000000 -0300
@@ -1,7 +1,7 @@
 #
 # Automatically generated make config: don't edit
 # Linux kernel version: 2.6.25
-# Mon Apr 21 23:28:49 2008
+# Mon Apr 21 23:43:04 2008
 #
 # CONFIG_64BIT is not set
 CONFIG_X86_32=y
@@ -228,12 +228,12 @@
 # CONFIG_NOHIGHMEM is not set
 CONFIG_HIGHMEM4G=y
 # CONFIG_HIGHMEM64G is not set
-CONFIG_VMSPLIT_3G=y
-# CONFIG_VMSPLIT_3G_OPT is not set
+# CONFIG_VMSPLIT_3G is not set
+CONFIG_VMSPLIT_3G_OPT=y
 # CONFIG_VMSPLIT_2G is not set
 # CONFIG_VMSPLIT_2G_OPT is not set
 # CONFIG_VMSPLIT_1G is not set
-CONFIG_PAGE_OFFSET=0xC0000000
+CONFIG_PAGE_OFFSET=0xB0000000
 CONFIG_HIGHMEM=y
 CONFIG_ARCH_FLATMEM_ENABLE=y
 CONFIG_ARCH_SPARSEMEM_ENABLE=y

Can this be the cause?

Thank you!

Best regards,
Nelson

[-- Attachment #2: working_config.gz --]
[-- Type: application/x-gzip, Size: 12496 bytes --]


* Re: [Bugme-new] [Bug 10473] New: Infinite loop "b44: eth0: powering down PHY"
  2008-04-22  3:01                               ` Nelson A. de Oliveira
@ 2008-04-22 13:34                                 ` Michael Buesch
  0 siblings, 0 replies; 17+ messages in thread
From: Michael Buesch @ 2008-04-22 13:34 UTC (permalink / raw)
  To: Nelson A. de Oliveira; +Cc: Andrew Morton, netdev, bugme-daemon, Gary Zambrano

On Tuesday 22 April 2008 05:01:54 Nelson A. de Oliveira wrote:
> Hi!
> 
> Maybe this can help:
> Using a new .config, I started to enable/disable options and test.
> What I found is that if I enable "3G/1G user/kernel split", the kernel
> works (it boots normally, the network interface works, etc). If I
> select "3G/1G user/kernel split (for full 1G low memory)" I get the
> infinite loop of "b44: eth0: powering down PHY".

Ah, so this bug isn't actually caused by a patch but rather by a
different config option.
I think we can't do much about it, currently. The device has strange
memory requirements and changing the split does actually break it.
This cannot be fixed until Andi Kleen's mask-allocator is merged.
This "bug" has always been there.

> Working config file (on 2.6.25) is attached.
> Diff to the non-working is below:
> 
> --- working_config	2008-04-21 23:42:40.000000000 -0300
> +++ not_working_config	2008-04-21 23:55:28.000000000 -0300
> @@ -1,7 +1,7 @@
>  #
>  # Automatically generated make config: don't edit
>  # Linux kernel version: 2.6.25
> -# Mon Apr 21 23:28:49 2008
> +# Mon Apr 21 23:43:04 2008
>  #
>  # CONFIG_64BIT is not set
>  CONFIG_X86_32=y
> @@ -228,12 +228,12 @@
>  # CONFIG_NOHIGHMEM is not set
>  CONFIG_HIGHMEM4G=y
>  # CONFIG_HIGHMEM64G is not set
> -CONFIG_VMSPLIT_3G=y
> -# CONFIG_VMSPLIT_3G_OPT is not set
> +# CONFIG_VMSPLIT_3G is not set
> +CONFIG_VMSPLIT_3G_OPT=y
>  # CONFIG_VMSPLIT_2G is not set
>  # CONFIG_VMSPLIT_2G_OPT is not set
>  # CONFIG_VMSPLIT_1G is not set
> -CONFIG_PAGE_OFFSET=0xC0000000
> +CONFIG_PAGE_OFFSET=0xB0000000
>  CONFIG_HIGHMEM=y
>  CONFIG_ARCH_FLATMEM_ENABLE=y
>  CONFIG_ARCH_SPARSEMEM_ENABLE=y
> 
> Can this be the cause?
> 
> Thank you!
> 
> Best regards,
> Nelson
> 



-- 
Greetings Michael.
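
The effect of the PAGE_OFFSET change in the quoted diff can be sketched with a little arithmetic. Two assumptions here are not stated in this thread: that the b44 hardware can only DMA to the first 1 GB of physical memory (a 30-bit mask), and that i386 reserves roughly 128 MB of kernel address space for vmalloc by default. The function names are illustrative only.

```c
#include <stdint.h>

#define VMALLOC_RESERVE_BYTES (128ull << 20) /* assumed i386 default */
#define DMA_LIMIT_BYTES       (1ull << 30)   /* assumed 30-bit DMA mask */

/* Rough upper bound on directly mapped low memory for a given PAGE_OFFSET. */
static uint64_t max_lowmem_bytes(uint32_t page_offset)
{
    uint64_t kernel_space = (1ull << 32) - page_offset;

    return kernel_space - VMALLOC_RESERVE_BYTES;
}

/* Nonzero if ordinary low-memory allocations may land above the DMA limit. */
static int lowmem_can_exceed_dma_limit(uint32_t page_offset)
{
    return max_lowmem_bytes(page_offset) > DMA_LIMIT_BYTES;
}
```

Under these assumptions, PAGE_OFFSET=0xC0000000 gives at most about 896 MB of low memory, all of it below 1 GB, while PAGE_OFFSET=0xB0000000 allows about 1152 MB, so on a machine with enough RAM a normal kernel allocation can end up where the device cannot reach it.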


Thread overview: 17+ messages
     [not found] <bug-10473-10286@http.bugzilla.kernel.org/>
2008-04-18  0:12 ` [Bugme-new] [Bug 10473] New: Infinite loop "b44: eth0: powering down PHY" Andrew Morton
2008-04-18 14:06   ` Michael Buesch
2008-04-18 15:23     ` Nelson A. de Oliveira
2008-04-18 15:32       ` Michael Buesch
2008-04-18 17:12         ` Nelson A. de Oliveira
2008-04-18 17:19           ` Michael Buesch
2008-04-18 17:43             ` Nelson A. de Oliveira
2008-04-18 17:59               ` Michael Buesch
2008-04-18 18:09                 ` Nelson A. de Oliveira
2008-04-18 18:18                   ` Michael Buesch
2008-04-18 19:02                     ` Nelson A. de Oliveira
2008-04-18 19:19                       ` Michael Buesch
2008-04-18 20:38                         ` Nelson A. de Oliveira
2008-04-21 18:14                           ` Nelson A. de Oliveira
2008-04-21 18:21                             ` Michael Buesch
2008-04-22  3:01                               ` Nelson A. de Oliveira
2008-04-22 13:34                                 ` Michael Buesch
