* Re: tg3 kernel bug in 2.6.18-mm3 and 2.6.19-rc2-mm2
       [not found] <20061021132239.GA29288@gamma.logic.tuwien.ac.at>
@ 2006-10-21 17:02 ` Andrew Morton
  2006-10-21 17:18   ` Rafael J. Wysocki
  2006-10-21 23:41   ` Norbert Preining
       [not found] ` <20061021.123814.106436476.davem@davemloft.net>
  1 sibling, 2 replies; 7+ messages in thread
From: Andrew Morton @ 2006-10-21 17:02 UTC
  To: Norbert Preining; +Cc: linux-kernel, netdev, Michael Chan


cc's added.

On Sat, 21 Oct 2006 15:22:39 +0200
Norbert Preining <preining@logic.at> wrote:

> Hi all!
> 
> I get the same bug again and again, always when ifplugd is started:
> 
> tg3: eth0: No firmware running.
> BUG: soft lockup detected on CPU#0!
>  [<c0103ec7>] dump_trace+0x68/0x1b4
>  [<c010402b>] show_trace_log_lvl+0x18/0x2c
>  [<c010463a>] show_trace+0xf/0x11
>  [<c010469d>] dump_stack+0x12/0x14
>  [<c0141cb3>] softlockup_tick+0xaa/0xc1
>  [<c0129bad>] update_process_times+0x3b/0x5e
>  [<c01362a1>] handle_update_profile+0x14/0x1e
>  [<c0115956>] smp_apic_timer_interrupt+0x49/0x5b
>  [<c0103998>] apic_timer_interrupt+0x28/0x30
> DWARF2 unwinder stuck at apic_timer_interrupt+0x28/0x30
> Leftover inexact backtrace:
>  [<c01d03af>] delay_tsc+0xb/0x13
>  [<c01d03e0>] __delay+0x6/0x7
>  [<c022ce12>] tg3_readphy+0x6e/0xd5
>  [<c022e0d1>] tg3_setup_copper_phy+0x30b/0xa15
>  [<c01064d9>] profile_pc+0x24/0x53
>  [<c022f475>] tg3_setup_phy+0xc9a/0xd1f
>  [<c0103998>] apic_timer_interrupt+0x28/0x30
>  [<c022c240>] _tw32_flush+0x3f/0x51
>  [<c022dc4a>] tg3_write_mem+0xcf/0xe7
>  [<c0231683>] tg3_reset_hw+0x10ab/0x13a0
>  [<c01d03e0>] __delay+0x6/0x7
>  [<c01d03e0>] __delay+0x6/0x7
>  [<c022c240>] _tw32_flush+0x3f/0x51
>  [<c01d03e0>] __delay+0x6/0x7
>  [<c022da97>] tg3_switch_clocks+0x8f/0x93
>  [<c0237673>] tg3_open+0x250/0x520
>  [<c02d3263>] dev_open+0x2b/0x62
>  [<c02d1dd8>] dev_change_flags+0x47/0xe4
>  [<c0307fcc>] devinet_ioctl+0x252/0x556
>  [<c02d2e5a>] dev_ifsioc+0x113/0x38d
>  [<c02d29c4>] dev_load+0x24/0x4b
>  [<c02c90c7>] sock_ioctl+0x0/0x1c2
>  [<c02c9265>] sock_ioctl+0x19e/0x1c2
>  [<c02ca151>] sock_map_fd+0x41/0x4a
>  [<c02c90c7>] sock_ioctl+0x0/0x1c2
>  [<c01684bb>] do_ioctl+0x1f/0x62
>  [<c0168743>] vfs_ioctl+0x245/0x257
>  [<c0168788>] sys_ioctl+0x33/0x4b
>  [<c0102f40>] syscall_call+0x7/0xb
>  =======================
> 
> With  2.6.19-rc2 (no -mm) it does not happen.
> 
> Normal dmesg gives:
> tg3.c:v3.66 (September 23, 2006)
> PCI: Enabling device 0000:03:00.0 (0000 -> 0002)
> ACPI: PCI Interrupt 0000:03:00.0[A] -> GSI 17 (level, low) -> IRQ 18
> PCI: Setting latency timer of device 0000:03:00.0 to 64
> eth0: Tigon3 [partno(BCM95789) rev 4101 PHY(5750)] (PCI Express) 10/100/1000Base
> T Ethernet 00:16:36:1e:27:ad
> eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1] 
> eth0: dma_rwctrl[76180000] dma_mask[64-bit]
> 


There are tg3 changes in -mm, but I doubt they caused this hang.

Can you test 2.6.19-rc2 plus the below?

Thanks.

--- linux-2.6.19-rc2/drivers/net/tg3.c	2006-10-13 10:35:00.000000000 -0700
+++ devel/drivers/net/tg3.c	2006-10-21 09:34:42.000000000 -0700
@@ -68,8 +68,8 @@
 
 #define DRV_MODULE_NAME		"tg3"
 #define PFX DRV_MODULE_NAME	": "
-#define DRV_MODULE_VERSION	"3.66"
-#define DRV_MODULE_RELDATE	"September 23, 2006"
+#define DRV_MODULE_VERSION	"3.67"
+#define DRV_MODULE_RELDATE	"October 18, 2006"
 
 #define TG3_DEF_MAC_MODE	0
 #define TG3_DEF_RX_MODE		0
@@ -129,7 +129,7 @@
 #define RX_JUMBO_PKT_BUF_SZ	(9046 + tp->rx_offset + 64)
 
 /* minimum number of free TX descriptors required to wake up TX process */
-#define TG3_TX_WAKEUP_THRESH		(TG3_TX_RING_SIZE / 4)
+#define TG3_TX_WAKEUP_THRESH(tp)		((tp)->tx_pending / 4)
 
 /* number of ETHTOOL_GSTATS u64's */
 #define TG3_NUM_STATS		(sizeof(struct tg3_ethtool_stats)/sizeof(u64))
@@ -3075,10 +3075,10 @@ static void tg3_tx(struct tg3 *tp)
 	smp_mb();
 
 	if (unlikely(netif_queue_stopped(tp->dev) &&
-		     (tg3_tx_avail(tp) > TG3_TX_WAKEUP_THRESH))) {
+		     (tg3_tx_avail(tp) > TG3_TX_WAKEUP_THRESH(tp)))) {
 		netif_tx_lock(tp->dev);
 		if (netif_queue_stopped(tp->dev) &&
-		    (tg3_tx_avail(tp) > TG3_TX_WAKEUP_THRESH))
+		    (tg3_tx_avail(tp) > TG3_TX_WAKEUP_THRESH(tp)))
 			netif_wake_queue(tp->dev);
 		netif_tx_unlock(tp->dev);
 	}
@@ -3928,7 +3928,7 @@ static int tg3_start_xmit(struct sk_buff
 	tp->tx_prod = entry;
 	if (unlikely(tg3_tx_avail(tp) <= (MAX_SKB_FRAGS + 1))) {
 		netif_stop_queue(dev);
-		if (tg3_tx_avail(tp) > TG3_TX_WAKEUP_THRESH)
+		if (tg3_tx_avail(tp) > TG3_TX_WAKEUP_THRESH(tp))
 			netif_wake_queue(tp->dev);
 	}
 
@@ -4143,7 +4143,7 @@ static int tg3_start_xmit_dma_bug(struct
 	tp->tx_prod = entry;
 	if (unlikely(tg3_tx_avail(tp) <= (MAX_SKB_FRAGS + 1))) {
 		netif_stop_queue(dev);
-		if (tg3_tx_avail(tp) > TG3_TX_WAKEUP_THRESH)
+		if (tg3_tx_avail(tp) > TG3_TX_WAKEUP_THRESH(tp))
 			netif_wake_queue(tp->dev);
 	}
 
@@ -8106,7 +8106,10 @@ static int tg3_set_ringparam(struct net_
 
 	if ((ering->rx_pending > TG3_RX_RING_SIZE - 1) ||
 	    (ering->rx_jumbo_pending > TG3_RX_JUMBO_RING_SIZE - 1) ||
-	    (ering->tx_pending > TG3_TX_RING_SIZE - 1))
+	    (ering->tx_pending > TG3_TX_RING_SIZE - 1) ||
+	    (ering->tx_pending <= MAX_SKB_FRAGS) ||
+	    ((tp->tg3_flags2 & TG3_FLG2_HW_TSO_1_BUG) &&
+	     (ering->tx_pending <= (MAX_SKB_FRAGS * 3))))
 		return -EINVAL;
 
 	if (netif_running(dev)) {
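
The thread does not say why the patch was written, but its effect can be read off the diff: the TX wake-up threshold now scales with the configured tx_pending instead of the full ring size, and tg3_set_ringparam() rejects tx_pending values that are too small. Below is a minimal user-space sketch of that logic, not the driver code. The struct, the `sketch_` names, the ring size of 512 and the MAX_SKB_FRAGS value of 18 are assumptions made for illustration, and `sketch_tx_avail()` is only a guess at the usual "configured size minus in-flight" accounting.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Assumed values for illustration; the real ones come from the kernel tree. */
#define SKETCH_TX_RING_SIZE  512u
#define SKETCH_MAX_SKB_FRAGS 18u

/* Hypothetical stand-in for the few struct tg3 fields the patch touches. */
struct tg3_sketch {
	unsigned int tx_pending; /* TX descriptors the driver may use */
	unsigned int tx_prod;    /* producer index */
	unsigned int tx_cons;    /* consumer index */
	bool hw_tso_1_bug;       /* stands in for TG3_FLG2_HW_TSO_1_BUG */
};

/* Free descriptors (assumed accounting): never more than tx_pending. */
static unsigned int sketch_tx_avail(const struct tg3_sketch *tp)
{
	unsigned int in_flight =
		(tp->tx_prod - tp->tx_cons) & (SKETCH_TX_RING_SIZE - 1);

	return tp->tx_pending - in_flight;
}

/* Pre-patch test: threshold is a fixed quarter of the full ring. */
static bool sketch_wake_old(const struct tg3_sketch *tp)
{
	return sketch_tx_avail(tp) > SKETCH_TX_RING_SIZE / 4;
}

/* Post-patch test: threshold is a quarter of the configured size. */
static bool sketch_wake_new(const struct tg3_sketch *tp)
{
	return sketch_tx_avail(tp) > tp->tx_pending / 4;
}

/* Mirrors the tx_pending part of the new tg3_set_ringparam() check. */
static int sketch_check_tx_pending(const struct tg3_sketch *tp,
				   unsigned int tx_pending)
{
	if (tx_pending > SKETCH_TX_RING_SIZE - 1 ||
	    tx_pending <= SKETCH_MAX_SKB_FRAGS ||
	    (tp->hw_tso_1_bug && tx_pending <= SKETCH_MAX_SKB_FRAGS * 3))
		return -EINVAL;
	return 0;
}

/* Small self-check: a ring shrunk to 100 entries, completely idle. */
static void sketch_self_check(void)
{
	struct tg3_sketch tp = { 100, 0, 0, false };

	assert(!sketch_wake_old(&tp)); /* 100 free can never exceed 128 */
	assert(sketch_wake_new(&tp));  /* 100 free exceeds 100 / 4 */
	assert(sketch_check_tx_pending(&tp, 18) == -EINVAL);
	assert(sketch_check_tx_pending(&tp, 19) == 0);
}
```

Under these assumptions, a ring shrunk below a quarter of the full size could never satisfy the old wake-up test even when idle, which is the case the per-device threshold avoids. Calling `sketch_self_check()` from a `main()` exercises it.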



* Re: tg3 kernel bug in 2.6.18-mm3 and 2.6.19-rc2-mm2
  2006-10-21 17:02 ` tg3 kernel bug in 2.6.18-mm3 and 2.6.19-rc2-mm2 Andrew Morton
@ 2006-10-21 17:18   ` Rafael J. Wysocki
  2006-10-21 23:41   ` Norbert Preining
  1 sibling, 0 replies; 7+ messages in thread
From: Rafael J. Wysocki @ 2006-10-21 17:18 UTC
  To: Andrew Morton; +Cc: Norbert Preining, linux-kernel, netdev, Michael Chan

On Saturday, 21 October 2006 19:02, Andrew Morton wrote:
> 
> cc's added.
> 
> On Sat, 21 Oct 2006 15:22:39 +0200
> Norbert Preining <preining@logic.at> wrote:
> 
> > Hi all!
> > 
> > I get the same bug again and again, always when ifplugd is started:
> > 
> > tg3: eth0: No firmware running.
> > BUG: soft lockup detected on CPU#0!
> >  [<c0103ec7>] dump_trace+0x68/0x1b4
> >  [<c010402b>] show_trace_log_lvl+0x18/0x2c
> >  [<c010463a>] show_trace+0xf/0x11
> >  [<c010469d>] dump_stack+0x12/0x14
> >  [<c0141cb3>] softlockup_tick+0xaa/0xc1
> >  [<c0129bad>] update_process_times+0x3b/0x5e
> >  [<c01362a1>] handle_update_profile+0x14/0x1e
> >  [<c0115956>] smp_apic_timer_interrupt+0x49/0x5b
> >  [<c0103998>] apic_timer_interrupt+0x28/0x30
> > DWARF2 unwinder stuck at apic_timer_interrupt+0x28/0x30
> > Leftover inexact backtrace:
> >  [<c01d03af>] delay_tsc+0xb/0x13
> >  [<c01d03e0>] __delay+0x6/0x7
> >  [<c022ce12>] tg3_readphy+0x6e/0xd5
> >  [<c022e0d1>] tg3_setup_copper_phy+0x30b/0xa15
> >  [<c01064d9>] profile_pc+0x24/0x53
> >  [<c022f475>] tg3_setup_phy+0xc9a/0xd1f
> >  [<c0103998>] apic_timer_interrupt+0x28/0x30
> >  [<c022c240>] _tw32_flush+0x3f/0x51
> >  [<c022dc4a>] tg3_write_mem+0xcf/0xe7
> >  [<c0231683>] tg3_reset_hw+0x10ab/0x13a0
> >  [<c01d03e0>] __delay+0x6/0x7
> >  [<c01d03e0>] __delay+0x6/0x7
> >  [<c022c240>] _tw32_flush+0x3f/0x51
> >  [<c01d03e0>] __delay+0x6/0x7
> >  [<c022da97>] tg3_switch_clocks+0x8f/0x93
> >  [<c0237673>] tg3_open+0x250/0x520
> >  [<c02d3263>] dev_open+0x2b/0x62
> >  [<c02d1dd8>] dev_change_flags+0x47/0xe4
> >  [<c0307fcc>] devinet_ioctl+0x252/0x556
> >  [<c02d2e5a>] dev_ifsioc+0x113/0x38d
> >  [<c02d29c4>] dev_load+0x24/0x4b
> >  [<c02c90c7>] sock_ioctl+0x0/0x1c2
> >  [<c02c9265>] sock_ioctl+0x19e/0x1c2
> >  [<c02ca151>] sock_map_fd+0x41/0x4a
> >  [<c02c90c7>] sock_ioctl+0x0/0x1c2
> >  [<c01684bb>] do_ioctl+0x1f/0x62
> >  [<c0168743>] vfs_ioctl+0x245/0x257
> >  [<c0168788>] sys_ioctl+0x33/0x4b
> >  [<c0102f40>] syscall_call+0x7/0xb
> >  =======================
> > 
> > With  2.6.19-rc2 (no -mm) it does not happen.
> > 
> > Normal dmesg gives:
> > tg3.c:v3.66 (September 23, 2006)
> > PCI: Enabling device 0000:03:00.0 (0000 -> 0002)
> > ACPI: PCI Interrupt 0000:03:00.0[A] -> GSI 17 (level, low) -> IRQ 18
> > PCI: Setting latency timer of device 0000:03:00.0 to 64
> > eth0: Tigon3 [partno(BCM95789) rev 4101 PHY(5750)] (PCI Express) 10/100/1000Base
> > T Ethernet 00:16:36:1e:27:ad
> > eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1] 
> > eth0: dma_rwctrl[76180000] dma_mask[64-bit]
> > 
> 
> 
> There are tg3 changes in -mm, but I doubt it they caused this hang.

FWIW, I have a tg3 running just fine with 2.6.19-rc2-mm2, on x86-64.

Greetings,
Rafael


-- 
You never change things by fighting the existing reality.
		R. Buckminster Fuller


* Re: tg3 kernel bug in 2.6.18-mm3 and 2.6.19-rc2-mm2
  2006-10-21 17:02 ` tg3 kernel bug in 2.6.18-mm3 and 2.6.19-rc2-mm2 Andrew Morton
  2006-10-21 17:18   ` Rafael J. Wysocki
@ 2006-10-21 23:41   ` Norbert Preining
  2006-10-22  5:47     ` Michael Chan
  1 sibling, 1 reply; 7+ messages in thread
From: Norbert Preining @ 2006-10-21 23:41 UTC
  To: Andrew Morton, Rafael J. Wysocki, David Miller
  Cc: linux-kernel, netdev, Michael Chan

Hi Andrew, hi all!

On Sat, 21 Oct 2006, Andrew Morton wrote:
> Can you test 2.6.19-rc2 plus the below?

2.6.19-rc2	works
2.6.19-rc2+patch does not work

So it is this patch.

hw:
Acer TravelMate 3012WMi
03:00.0 Ethernet controller: Broadcom Corporation NetLink BCM5789 Gigabit Ethernet PCI Express (rev 11)

If you need dmesg, .config, something else, no problem.

Best wishes

Norbert

-------------------------------------------------------------------------------
Dr. Norbert Preining <preining@logic.at>                    Università di Siena
Debian Developer <preining@debian.org>                         Debian TeX Group
gpg DSA: 0x09C5B094      fp: 14DF 2E6C 0307 BE6D AD76  A9C0 D2BF 4AA3 09C5 B094
-------------------------------------------------------------------------------
DITHERINGTON (n)
Sudden access to panic experienced by one who realises that he is
being drawn inexorably into a clabby (q.v.) conversion, i.e. one he
has no hope of enjoying, benefiting from or understanding.
			--- Douglas Adams, The Meaning of Liff


* Re: tg3 kernel bug in 2.6.18-mm3 and 2.6.19-rc2-mm2
  2006-10-21 23:41   ` Norbert Preining
@ 2006-10-22  5:47     ` Michael Chan
  2006-10-22 16:22       ` Norbert Preining
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Chan @ 2006-10-22  5:47 UTC
  To: Norbert Preining, Andrew Morton, Rafael J. Wysocki, David Miller
  Cc: linux-kernel, netdev

Norbert Preining wrote:
> On Sam, 21 Okt 2006, Andrew Morton wrote:
> > Can you test 2.6.19-rc2 plus the below?
> 
> 2.6.19-rc2	works
> 2.6.19-rc2+patch does not work
> 
> So it is this patch.
> 
It doesn't make any sense.  This patch is totally benign and
cannot cause the "No firmware running" and lockup that you
reported.  Can you please double-check?



* Re: tg3 kernel bug in 2.6.18-mm3 and 2.6.19-rc2-mm2
       [not found] ` <20061021.123814.106436476.davem@davemloft.net>
@ 2006-10-22  6:36   ` Andrew Morton
  0 siblings, 0 replies; 7+ messages in thread
From: Andrew Morton @ 2006-10-22  6:36 UTC
  To: David Miller; +Cc: preining, linux-kernel, Michael Chan, netdev

On Sat, 21 Oct 2006 12:38:14 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> From: Norbert Preining <preining@logic.at>
> Date: Sat, 21 Oct 2006 15:22:39 +0200
> 
> >  [<c010469d>] dump_stack+0x12/0x14
> >  [<c0141cb3>] softlockup_tick+0xaa/0xc1
> >  [<c0129bad>] update_process_times+0x3b/0x5e
> >  [<c01362a1>] handle_update_profile+0x14/0x1e
> >  [<c0115956>] smp_apic_timer_interrupt+0x49/0x5b
> >  [<c0103998>] apic_timer_interrupt+0x28/0x30
> > DWARF2 unwinder stuck at apic_timer_interrupt+0x28/0x30
> > Leftover inexact backtrace:
> >  [<c01d03af>] delay_tsc+0xb/0x13
> >  [<c01d03e0>] __delay+0x6/0x7
> 
> It's OOPS'ing by softlockup'ing in udelay() and then we get a corrupt
> backtrace.

The unwinder-based backtrace is screwed up (yet again) but the old-style
backtrace is there, in all its messy glory.

Weeding out the crap, I think it's this:

 [<c01d03af>] delay_tsc+0xb/0x13
 [<c01d03e0>] __delay+0x6/0x7
 [<c022c240>] _tw32_flush+0x3f/0x51
 [<c022da97>] tg3_switch_clocks+0x8f/0x93

 <I assume tg3_init_hw() got inlined>

 [<c0237673>] tg3_open+0x250/0x520
 [<c02d3263>] dev_open+0x2b/0x62
 [<c02d1dd8>] dev_change_flags+0x47/0xe4
 [<c0307fcc>] devinet_ioctl+0x252/0x556
 [<c02d2e5a>] dev_ifsioc+0x113/0x38d
 [<c02d29c4>] dev_load+0x24/0x4b
 [<c02c9265>] sock_ioctl+0x19e/0x1c2

It's strange that the post-2.6.19-rc2 changes triggered this - that code
won't have run yet.

Norbert, are you really sure?
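
The trace above bottoms out in __delay(), which is what a bounded busy-wait looks like from the outside. The following is a minimal user-space sketch of such a loop, assuming a device that may never report ready. It is not taken from tg3.c: `fake_dev`, `fake_udelay()`, `fake_is_busy()`, the loop count of 5000 and the 10 microsecond delay are all hypothetical and chosen only to make the arithmetic visible.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical numbers, chosen only to make the arithmetic visible. */
#define SKETCH_BUSY_LOOPS    5000u
#define SKETCH_POLL_DELAY_US 10u

/* Hypothetical device model: reports "busy" for the first busy_reads polls. */
struct fake_dev {
	unsigned int busy_reads;
	unsigned long waited_us; /* total time spent in the simulated delay */
};

/* Stand-in for udelay(): only accounts for the time that would be burned. */
static void fake_udelay(struct fake_dev *dev, unsigned int usecs)
{
	dev->waited_us += usecs;
}

/* Stand-in for reading a status register and testing a busy bit. */
static int fake_is_busy(struct fake_dev *dev)
{
	if (dev->busy_reads == 0)
		return 0;
	dev->busy_reads--;
	return 1;
}

/* Bounded busy-wait: spin until the device is ready or the budget is spent. */
static int sketch_poll_ready(struct fake_dev *dev)
{
	unsigned int loops;

	for (loops = SKETCH_BUSY_LOOPS; loops != 0; loops--) {
		fake_udelay(dev, SKETCH_POLL_DELAY_US);
		if (!fake_is_busy(dev))
			return 0;
	}
	return -EBUSY;
}

/* Small self-check covering a responsive and an unresponsive device. */
static void sketch_poll_self_check(void)
{
	struct fake_dev ok = { 3, 0 };
	struct fake_dev dead = { 1000000u, 0 };

	assert(sketch_poll_ready(&ok) == 0);
	assert(ok.waited_us == 40);      /* four polls of 10 us each */
	assert(sketch_poll_ready(&dead) == -EBUSY);
	assert(dead.waited_us == 50000); /* 5000 polls of 10 us each */
}
```

With these made-up numbers a single failed poll burns 50 ms of CPU time. If a code path issued many such polls back to back against an unresponsive chip, the CPU could spin long enough without scheduling for the soft-lockup detector to fire, which would be consistent with the delay frames in the trace. Whether that is what happened here is not established in this thread.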


* Re: tg3 kernel bug in 2.6.18-mm3 and 2.6.19-rc2-mm2
  2006-10-22  5:47     ` Michael Chan
@ 2006-10-22 16:22       ` Norbert Preining
  2006-10-22 18:13         ` Norbert Preining
  0 siblings, 1 reply; 7+ messages in thread
From: Norbert Preining @ 2006-10-22 16:22 UTC
  To: Michael Chan, Andrew Morton
  Cc: Rafael J. Wysocki, David Miller, linux-kernel, netdev

Hi all!

On Sat, 21 Oct 2006, Michael Chan wrote:
> > 2.6.19-rc2	works
> > 2.6.19-rc2+patch does not work
> > 
> It doesn't make any sense.  This patch is totally benign and
> cannot cause the "No firmware running" and lockup that you
> reported.  Can you please double-check?

Ok, I cannot reproduce it anymore. No idea why it happened.

Anyway, with my current rc2+tg3 patch I have no problems, while with
rc2-mm2 I still have them.

Best wishes

Norbert

-------------------------------------------------------------------------------
Dr. Norbert Preining <preining@logic.at>                    Università di Siena
Debian Developer <preining@debian.org>                         Debian TeX Group
gpg DSA: 0x09C5B094      fp: 14DF 2E6C 0307 BE6D AD76  A9C0 D2BF 4AA3 09C5 B094
-------------------------------------------------------------------------------
SHENANDOAH (n.)
The infinite smugness of one who knows they are entitled to a place in
a nuclear bunker.
			--- Douglas Adams, The Meaning of Liff


* Re: tg3 kernel bug in 2.6.18-mm3 and 2.6.19-rc2-mm2
  2006-10-22 16:22       ` Norbert Preining
@ 2006-10-22 18:13         ` Norbert Preining
  0 siblings, 0 replies; 7+ messages in thread
From: Norbert Preining @ 2006-10-22 18:13 UTC
  To: Michael Chan, Andrew Morton
  Cc: Rafael J. Wysocki, David Miller, linux-kernel, netdev

Hi all!

Ok, you will laugh at me...

On Sun, 22 Oct 2006, preining wrote:
> > > 2.6.19-rc2	works
> > > 2.6.19-rc2+patch does not work
> > > 
> > It doesn't make any sense.  This patch is totally benign and
> > cannot cause the "No firmware running" and lockup that you
> > reported.  Can you please double-check?
> 
> Ok, I cannot reproduce it anymore. No idea why it happened.

And now I can reproduce it again. How (again, please don't laugh):

I booted into Windows (sometimes one has to: a contract from the EC with
macros in Excel tables ... grrr).

WinXP didn't manage to get an IP address from my cable modem.

Rebooting into Linux, same problem as reported,

and, though I have no idea whether this is related: the modem just loses
sync and needs several resets until it finds its way back into
synchronization.

I will do some more experiments, switching between Windows and different
Linux kernels.

Sorry for the chaos, no idea what has happened here!!!

Best wishes

Norbert

-------------------------------------------------------------------------------
Dr. Norbert Preining <preining@logic.at>                    Università di Siena
Debian Developer <preining@debian.org>                         Debian TeX Group
gpg DSA: 0x09C5B094      fp: 14DF 2E6C 0307 BE6D AD76  A9C0 D2BF 4AA3 09C5 B094
-------------------------------------------------------------------------------
`We've got to find out what people want from fire, how
they relate to it, what sort of image it has for them.'
The crowd were tense. They were expecting something
wonderful from Ford.
`Stick it up your nose,' he said.
`Which is precisely the sort of thing we need to know,'
insisted the girl, `Do people want fire that can be fitted
nasally?'
                 --- Ford "debating" what to do with fire with a marketing
                 --- girl.
                 --- Douglas Adams, The Hitchhikers Guide to the Galaxy
