From: Nils Holland <nholland@tisys.org>
To: David Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org, linux-pci@vger.kernel.org, rajatxjain@gmail.com
Subject: Re: [bisected] tg3 broken in 3.18.0?
Date: Sat, 13 Dec 2014 22:02:51 +0100
Message-ID: <20141213210251.GA12812@teela.fritz.box>

On Fri, Dec 12, 2014 at 08:18:31PM -0500, David Miller wrote:
> From: Nils Holland <nholland@tisys.org>
> Date: Sat, 13 Dec 2014 02:14:08 +0100
> 
> > 
> > My bisect exercise suggests that the following commit is the culprit:
> > 
> > 89665a6a71408796565bfd29cfa6a7877b17a667 (PCI: Check only the Vendor
> > ID to identify Configuration Request Retry)
> 
> You definitely need to bring this up with the author of that change
> and the relevent list for the PCI subsystem and/or linux-kernel.

I've already sent an inquiry to Rajat Jain, the author of the patch
in question, and this message is also CC'd to linux-pci@.

With this message, I'd like to add one more result from the
investigation I've done today, in the hope that it helps those with
more knowledge track down the issue.

I've added a little debug output to tg3.c in the function
tg3_poll_fw(), as that function contains the code that prints the
"No firmware running" line that was visible in dmesg on those kernels
where tg3 would not work for me. So, I had this:

static int tg3_poll_fw(struct tg3 *tp)
{
        int i;
        u32 val;

        /* Debug marker: logged on every entry into tg3_poll_fw() */
        netdev_info(tp->dev, "XX: Boom!\n");
        [...]
}

Next, I searched dmesg for occurrences of this debug output, using a
standard 3.18.0 kernel (where my tg3 doesn't work) as well as a 3.18.0
kernel with 89665a6a71408796565bfd29cfa6a7877b17a667 reverted (where
my tg3 works). Here are the results:

[standard 3.18.0 (=problematic)]:
[    2.197653] libphy: tg3 mdio bus: probed
[    2.257488] tg3 0000:02:00.0 eth0:
        Tigon3 [partno(BCM57780) rev 57780001] (PCI Express) MAC address
        00:19:99:ce:13:a6
[    2.259589] tg3 0000:02:00.0 eth0:
        attached PHY driver [Broadcom BCM57780] (mii_bus:phy_addr=200:01)
[    2.261740] tg3 0000:02:00.0 eth0:
        RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
[    2.263912] tg3 0000:02:00.0 eth0:
        dma_rwctrl[76180000] dma_mask[64-bit]
[...]
[   10.028002] tg3 0000:02:00.0: irq 25 for MSI/MSI-X
[   10.028247] tg3 0000:02:00.0 enp2s0: XX: Boom!
[   12.157034] tg3 0000:02:00.0 enp2s0: No firmware running


[3.18.0 with the above-mentioned commit reverted; 3.17.3 behaves the
same; both result in a working tg3]:
[    1.397167] libphy: tg3 mdio bus: probed
[    1.456473] tg3 0000:02:00.0
        (unnamed net_device) (uninitialized): XX: Boom!
[    1.464987] tg3 0000:02:00.0 eth0:
        Tigon3 [partno(BCM57780) rev 57780001] (PCI Express) MAC address
        00:19:99:ce:13:a6
[    1.467118] tg3 0000:02:00.0 eth0:
        attached PHY driver [Broadcom BCM57780] (mii_bus:phy_addr=200:01)
[    1.469311] tg3 0000:02:00.0 eth0:
        RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
[    1.471500] tg3 0000:02:00.0 eth0:
        dma_rwctrl[76180000] dma_mask[64-bit]
[...]
[    9.631629] tg3 0000:02:00.0: irq 25 for MSI/MSI-X
[    9.631962] tg3 0000:02:00.0 enp2s0: XX: Boom!
[    9.634339] tg3 0000:02:00.0 enp2s0: XX: Boom!
[    9.642741] IPv6:
        ADDRCONF(NETDEV_UP): enp2s0: link is not ready
[   10.479636] tg3 0000:02:00.0
        enp2s0: Link is down
[   11.484498] tg3 0000:02:00.0
        enp2s0: Link is up at 100 Mbps, full duplex

As can be seen, there are two tg3-related sections in my dmesg in both
the working and non-working scenarios: At about 1 - 2 secs, the card
seems to begin initializing, and at about 9 - 10 seconds it is (or
should be) ready to establish a network connection.

My debug line, and thus tg3.c's tg3_poll_fw(), seems to be hit three
times in the working situation: The first hit occurs at 1.456473, where
the tg3 device is still reported as "(unnamed net_device)
(uninitialized)". Then, it gets hit twice more at around 9.63. At this
point the driver already reports the card as initialized and by its
real name.

In the non-working situation, the debug line seems to be hit only
once, at 10.028247. At this point, the tg3 is already reported as
initialized, just like when it's hit the second and third time in the
working situation.
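
The hit counts above were read off dmesg by eye. A minimal sketch of
how the same count could be automated follows. marker_timestamp() and
count_markers() are hypothetical helper names, not part of tg3.c, and
the sketch assumes unwrapped dmesg output with one message per line
(the listings in this mail are wrapped for width, so the first hit in
the working log would have to be re-joined with its timestamp line
first).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Marker string emitted by the netdev_info() debug line in tg3_poll_fw(). */
#define MARKER "XX: Boom!"

/*
 * If `line` looks like "[   10.028247] ... XX: Boom!", store the
 * timestamp in *ts and return 1. Return 0 for lines without the marker
 * or without a leading "[seconds]" timestamp.
 */
static int marker_timestamp(const char *line, double *ts)
{
        assert(line != NULL && ts != NULL);
        if (strstr(line, MARKER) == NULL)
                return 0;
        return sscanf(line, " [ %lf ]", ts) == 1;
}

/* Count how many of the n dmesg lines carry the debug marker. */
static int count_markers(const char *lines[], size_t n)
{
        int hits = 0;
        double ts;
        size_t i;

        assert(lines != NULL);
        for (i = 0; i < n; i++)
                if (marker_timestamp(lines[i], &ts))
                        hits++;
        return hits;
}
```

Fed the marker lines from the two logs above (unwrapped), this should
count three hits for the working kernel and one for the failing one.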

Bottom line is that commit 89665a6a71408796565bfd29cfa6a7877b17a667
really makes a difference regarding the way the tg3 card is
initialized, which seems to cause the problem.
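
For context on what "Check only the Vendor ID" in that commit's
subject could mean in code, here is a minimal sketch. It is not the
actual kernel source. It assumes that the check before the commit
compared the whole 32-bit Vendor/Device ID dword against 0xffff0001
(the value returned for a Configuration Request Retry Status
completion when CRS Software Visibility is enabled), and that the
commit narrowed the comparison to the low 16 bits. All function names
are hypothetical, and nothing in the results above shows that this
difference is what actually affects tg3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Vendor ID value that signals a CRS completion to software. */
#define CRS_VENDOR_ID 0x0001u

/* Assumed older check: whole dword, i.e. Device ID 0xffff + Vendor ID 0x0001. */
static bool crs_full_dword(uint32_t id)
{
        return id == 0xffff0001u;
}

/* Check suggested by the commit subject: low 16 bits (Vendor ID) only. */
static bool crs_vendor_only(uint32_t id)
{
        return (id & 0xffffu) == CRS_VENDOR_ID;
}

/* Self-check: the two predicates differ only when the upper half is not 0xffff. */
static void crs_selfcheck(void)
{
        assert(crs_full_dword(0xffff0001u) && crs_vendor_only(0xffff0001u));
        assert(!crs_full_dword(0x12340001u) && crs_vendor_only(0x12340001u));
}
```

Under these assumptions, any read that returns 0x0001 in the low 16
bits with something other than 0xffff above it would be treated as
"retry" by the narrower check but not by the wider one.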

Greetings,
Nils
