netdev.vger.kernel.org archive mirror
* tg3 broken in 3.18.0?
@ 2014-12-10 23:06 Nils Holland
  2014-12-11 16:45 ` Marcelo Ricardo Leitner
  0 siblings, 1 reply; 6+ messages in thread
From: Nils Holland @ 2014-12-10 23:06 UTC (permalink / raw)
  To: netdev

Hi everyone,

I just upgraded a machine from 3.17.3 to 3.18.0 and noticed that after
the upgrade, the machine's network interface (which is a tg3) would no
longer run correctly (or, for that matter, run at all). During the
upgrade, I didn't change any kernel config options or any other parts
of the system.

Since the machine is remote and I don't have direct access to it, it's
kind of hard currently to give more details, but here's what I'm
seeing in the logs:

[Booting 3.17.3:]
[    1.383151] tg3.c:v3.137 (May 11, 2014)
[    1.387296] libphy: tg3 mdio bus: probed
[    1.452600] tg3 0000:02:00.0 eth0: Tigon3 [partno(BCM57780) rev 57780001] (PCI Express) MAC address 00:19:99:ce:13:a6
[    1.454660] tg3 0000:02:00.0 eth0: attached PHY driver [Broadcom BCM57780] (mii_bus:phy_addr=200:01)
[    1.456764] tg3 0000:02:00.0 eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
[    1.458911] tg3 0000:02:00.0 eth0: dma_rwctrl[76180000] dma_mask[64-bit]
[...]
[    6.602608] tg3 0000:02:00.0 enp2s0: renamed from eth0
[    9.865638] tg3 0000:02:00.0: irq 25 for MSI/MSI-X
[    9.887584] IPv6: ADDRCONF(NETDEV_UP): enp2s0: link is not ready
[   10.469819] tg3 0000:02:00.0 enp2s0: Link is down
[   12.477396] tg3 0000:02:00.0 enp2s0: Link is up at 100 Mbps, full duplex
[   12.477404] tg3 0000:02:00.0 enp2s0: Flow control is off for TX and off for RX

[Booting 3.18.0:]
[    2.192915] tg3.c:v3.137 (May 11, 2014)
[    2.196767] libphy: tg3 mdio bus: probed
[    2.256294] tg3 0000:02:00.0 eth0: Tigon3 [partno(BCM57780) rev 57780001] (PCI Express) MAC address 00:19:99:ce:13:a6
[    2.258387] tg3 0000:02:00.0 eth0: attached PHY driver [Broadcom BCM57780] (mii_bus:phy_addr=200:01)
[    2.260530] tg3 0000:02:00.0 eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
[    2.262679] tg3 0000:02:00.0 eth0: dma_rwctrl[76180000] dma_mask[64-bit]
[...]
[    7.431176] tg3 0000:02:00.0 enp2s0: renamed from eth0
[   10.422839] tg3 0000:02:00.0: irq 25 for MSI/MSI-X
[   12.530363] tg3 0000:02:00.0 enp2s0: No firmware running

That's the last thing I find about the card in the logs. After that,
the machine just sits there, otherwise working normally but
unreachable from the network.
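
Since the only difference between the two boots is the final tg3 line,
a saved kernel log can be classified mechanically. What follows is just
a rough sketch: the function name tg3_log_verdict is made up, and it
assumes the log was saved to a file (for example with dmesg) and that
the two message strings shown above are the only signals of interest.

```shell
# Hypothetical helper: classify a saved kernel log by the tg3 messages
# quoted above. Prints "bad", "good", or "unknown".
tg3_log_verdict() {
    log_file="$1"
    if grep -q 'No firmware running' "$log_file"; then
        # Failure signature seen when booting 3.18.0
        echo bad
    elif grep -q 'Link is up at' "$log_file"; then
        # Normal link-up message seen when booting 3.17.3
        echo good
    else
        # Neither message present: do not guess
        echo unknown
    fi
}
```

Usage would be along the lines of "dmesg > /tmp/boot.log" followed by
"tg3_log_verdict /tmp/boot.log"; the path is only an example.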

If I see things correctly, there were only two patches affecting tg3
between 3.17(.3) and 3.18:

2c7c9ea429ba30fe506747b7da110e2212d8fefa
a620a6bc1c94c22d6c312892be1e0ae171523125

Since the affected machine is remote, I haven't yet been able to run
more thorough tests. So I thought I'd report the issue and see whether
someone else has already seen it, or can test things with a more
easily accessible machine. Otherwise, I might start digging deeper.

Greetings,
Nils


* Re: tg3 broken in 3.18.0?
  2014-12-10 23:06 tg3 broken in 3.18.0? Nils Holland
@ 2014-12-11 16:45 ` Marcelo Ricardo Leitner
  2014-12-12 14:50   ` Jonathan Bither
  0 siblings, 1 reply; 6+ messages in thread
From: Marcelo Ricardo Leitner @ 2014-12-11 16:45 UTC (permalink / raw)
  To: Nils Holland, netdev

On 10-12-2014 21:06, Nils Holland wrote:
> Hi everyone,
>
> I just upgraded a machine from 3.17.3 to 3.18.0 and noticed that after
> the upgrade, the machine's network interface (which is a tg3) would no
> longer run correctly (or, for that matter, run at all). During the
> upgrade, I didn't change any kernel config options or any other parts
> of the system.

Same thing here! Thanks for reporting this, Nils.

> Since the machine is remote and I don't have direct access to it, it's
> kind of hard currently to give more details, but here's what I'm
> seeing in the logs:

I have access to mine, thanks to a secondary NIC.

$ ethtool -i p1p1
driver: tg3
version: 3.137
firmware-version: 5722-v3.13
bus-info: 0000:02:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no

$ ethtool p1p1
Settings for p1p1:
         Supported ports: [ TP ]
         Supported link modes:   10baseT/Half 10baseT/Full
                                 100baseT/Half 100baseT/Full
                                 1000baseT/Half 1000baseT/Full
         Supported pause frame use: No
         Supports auto-negotiation: Yes
         Advertised link modes:  10baseT/Half 10baseT/Full
                                 100baseT/Half 100baseT/Full
                                 1000baseT/Half 1000baseT/Full
         Advertised pause frame use: Symmetric
         Advertised auto-negotiation: Yes
         Speed: Unknown!
         Duplex: Unknown! (255)
         Port: Twisted Pair
         PHYAD: 1
         Transceiver: internal
         Auto-negotiation: on
         MDI-X: Unknown

$ sudo ip link set p1p1 up
RTNETLINK answers: No such device

> If I see things correctly, there were only two patches affecting tg3
> between 3.17(.3) and 3.18:
>
> 2c7c9ea429ba30fe506747b7da110e2212d8fefa
> a620a6bc1c94c22d6c312892be1e0ae171523125

I'm running net-next, 395eea6ccf2b253f81b4718ffbcae67d36fe2e69.
So my diffs would be:
$ git log v3.17..origin/master --oneline -- drivers/net/ethernet/broadcom/tg3.c
892311f ethtool: Support for configurable RSS hash function
60b7379 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
a620a6b tg3: fix ring init when there are more TX than RX channels
3964835 tg3: use netdev_rss_key_fill() helper
2c7c9ea tg3: Add skb->xmit_more support

With all of these reverted, the issue persists.

If no one has a better shot, I'll try bisecting later.

   Marcelo


* Re: tg3 broken in 3.18.0?
  2014-12-11 16:45 ` Marcelo Ricardo Leitner
@ 2014-12-12 14:50   ` Jonathan Bither
  2014-12-12 20:31     ` Nils Holland
  0 siblings, 1 reply; 6+ messages in thread
From: Jonathan Bither @ 2014-12-12 14:50 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner, Nils Holland, netdev

Not sure if it helps any, but tg3 works here after a 3.18 upgrade. I'd 
be happy to share any information if it would help you out.

[root@www ~]# uname -a
Linux localhost 3.18.0-1.el6.elrepo.i686 #1 SMP Mon Dec 8 10:55:34 EST 
2014 i686 i686 i386 GNU/Linux
[root@www ~]# ethtool -i eth0
driver: tg3
version: 3.137
firmware-version: 5704-v3.36, ASFIPMIc v2.37
bus-info: 0000:02:03.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no
[root@www ~]# ethtool eth0
Settings for eth0:
	Supported ports: [ TP ]
	Supported link modes:   10baseT/Half 10baseT/Full
	                        100baseT/Half 100baseT/Full
	                        1000baseT/Half 1000baseT/Full
	Supported pause frame use: No
	Supports auto-negotiation: Yes
	Advertised link modes:  10baseT/Half 10baseT/Full
	                        100baseT/Half 100baseT/Full
	                        1000baseT/Half 1000baseT/Full
	Advertised pause frame use: Symmetric
	Advertised auto-negotiation: Yes
	Link partner advertised link modes:  10baseT/Half 10baseT/Full
	                                     100baseT/Half 100baseT/Full
	                                     1000baseT/Full
	Link partner advertised pause frame use: No
	Link partner advertised auto-negotiation: Yes
	Speed: 1000Mb/s
	Duplex: Full
	Port: Twisted Pair
	PHYAD: 1
	Transceiver: internal
	Auto-negotiation: on
	MDI-X: on
	Supports Wake-on: g
	Wake-on: g
	Current message level: 0x000000ff (255)
			       drv probe link timer ifdown ifup rx_err tx_err
	Link detected: yes
[root@www ~]#


* Re: tg3 broken in 3.18.0?
  2014-12-12 14:50   ` Jonathan Bither
@ 2014-12-12 20:31     ` Nils Holland
  2014-12-13  1:14       ` [bisected] " Nils Holland
  0 siblings, 1 reply; 6+ messages in thread
From: Nils Holland @ 2014-12-12 20:31 UTC (permalink / raw)
  To: Jonathan Bither; +Cc: netdev

On Fri, Dec 12, 2014 at 09:50:53AM -0500, Jonathan Bither wrote:
> Not sure if it helps any, but tg3 works here after a 3.18 upgrade. I'd 
> be happy to share any information if it would help you out.

What I get here is this (output captured under 3.17.3):

triton513 ~ # ethtool -i enp2s0
driver: tg3
version: 3.137
firmware-version: sb
bus-info: 0000:02:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no

So you, Marcelo, and I all seem to have different firmware
versions. If I'm correct, different versions of tg3 exist that either
contain the firmware onboard or get it injected at driver
initialization (correct me if I'm wrong). If Marcelo's and my fw
version had been the same this might have given a clue, but nope.
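
To compare that field across machines without eyeballing the whole
output, something like the following might do. It is only a sketch:
fw_version is a made-up helper name, and it assumes the
"firmware-version:" line format shown in the ethtool -i output in this
thread.

```shell
# Hypothetical helper: read `ethtool -i` output on stdin and print only
# the value of the firmware-version field.
fw_version() {
    # -n suppresses default printing; the p flag prints only the line
    # where the substitution matched, with the field label stripped.
    sed -n 's/^firmware-version: *//p'
}
```

It would be used as "ethtool -i enp2s0 | fw_version", with the
interface name replaced by whatever the tg3 device is called locally.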

I'm putting my faith in Marcelo bisecting and finding out more details.
I might try that as well over the weekend, at least to the extent
possible without ever being able to have live access to the machine
when it is running a kernel exhibiting this issue.

Greetings,
Nils


* [bisected] tg3 broken in 3.18.0?
  2014-12-12 20:31     ` Nils Holland
@ 2014-12-13  1:14       ` Nils Holland
  2014-12-13  1:18         ` David Miller
  0 siblings, 1 reply; 6+ messages in thread
From: Nils Holland @ 2014-12-13  1:14 UTC (permalink / raw)
  To: netdev

Ok folks,

I now took the time to bisect the issue that killed the tg3 network
interface on one of my boxes in 3.18.0. Besides me, at least one other
person was affected, although we also have a confirmed report of
another person using tg3 without issues under 3.18.0.

My bisect exercise suggests that the following commit is the culprit:

89665a6a71408796565bfd29cfa6a7877b17a667 (PCI: Check only the Vendor
ID to identify Configuration Request Retry)

In case that rings a bell for anyone, I'd be more than glad to hear
about it! Otherwise, while I'm no expert at this, I'll do some more
investigations tomorrow. It's gotten kind of late during bisecting and
I'm off for some sleep now. ;-)

Greetings,
Nils


* Re: [bisected] tg3 broken in 3.18.0?
  2014-12-13  1:14       ` [bisected] " Nils Holland
@ 2014-12-13  1:18         ` David Miller
  0 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2014-12-13  1:18 UTC (permalink / raw)
  To: nholland; +Cc: netdev

From: Nils Holland <nholland@tisys.org>
Date: Sat, 13 Dec 2014 02:14:08 +0100

> Ok folks,
> 
> I now took the time to bisect the issue that killed the tg3 network
> interface on one of my boxes in 3.18.0. Besides me, at least one other
> person was affected, although we also have a confirmed report of
> another person using tg3 without issues under 3.18.0.
> 
> My bisect exercise suggests that the following commit is the culprit:
> 
> 89665a6a71408796565bfd29cfa6a7877b17a667 (PCI: Check only the Vendor
> ID to identify Configuration Request Retry)
> 
> In case that rings a bell for anyone, I'd be more than glad to hear
> about it! Otherwise, while I'm no expert at this, I'll do some more
> investigations tomorrow. It's gotten kind of late during bisecting and
> I'm off for some sleep now. ;-)

You definitely need to bring this up with the author of that change
and the relevant list for the PCI subsystem and/or linux-kernel.
