From: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
To: Bjorn Helgaas <bhelgaas@google.com>,
Prashant Sreedharan <prashant@broadcom.com>
Cc: Nils Holland <nholland@tisys.org>,
Michael Chan <mchan@broadcom.com>,
Rajat Jain <rajatxjain@gmail.com>,
David Miller <davem@davemloft.net>,
netdev <netdev@vger.kernel.org>,
"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
Rafael Wysocki <rjw@rjwysocki.net>
Subject: Re: [bisected] tg3 broken in 3.18.0?
Date: Fri, 19 Dec 2014 15:16:41 -0200
Message-ID: <54945D79.2070008@gmail.com>
In-Reply-To: <CAErSpo4UdxwL=wA8cBJ7_DAdTb2xNeg6600qB0=Jdxv80aVmcg@mail.gmail.com>
On 19-12-2014 15:09, Bjorn Helgaas wrote:
> On Thu, Dec 18, 2014 at 7:10 PM, Prashant Sreedharan
> <prashant@broadcom.com> wrote:
>> On Thu, 2014-12-18 at 21:26 +0100, Nils Holland wrote:
>>> On Thu, Dec 18, 2014 at 11:28:09AM -0800, Prashant Sreedharan wrote:
>>>> On Thu, 2014-12-18 at 12:15 -0700, Bjorn Helgaas wrote:
>>>>>
>>>>> Any updates from the hardware team?
>>>>>
>>>>> This is a pretty serious regression, but as far as I can tell, it is
>>>>> not a PCI bug. The device should respond to a config read of vendor
>>>>> ID. If the driver does something that makes the read return CRS
>>>>> status, I think the driver is responsible for doing whatever delay or
>>>>> other fixup is required.
>>>>>
>>>>> I'm inclined to reassign this bug to the tg3 driver unless you think
>>>>> the PCI core is doing something wrong here.
>>>>>
>>>>> Bjorn
>>>>
>>>> We were not able to reproduce this issue. Could you please check the
>>>> value of reg 0x70 before the pci_device_is_present call is made?
>>>> If bit 15 is set, config access will be retried.
>>>>
>>>> --- a/drivers/net/ethernet/broadcom/tg3.c
>>>> +++ b/drivers/net/ethernet/broadcom/tg3.c
>>>> @@ -9025,6 +9025,7 @@ static int tg3_chip_reset(struct tg3 *tp)
>>>>         void (*write_op)(struct tg3 *, u32, u32);
>>>>         int i, err;
>>>>
>>>> +       printk(KERN_ERR "config state: %x\n", tr32(TG3PCI_PCISTATE));
>>>>         if (!pci_device_is_present(tp->pdev))
>>>>                 return -ENODEV;
>>>
>>> No problem, I gave this a try and here is what I get:
>>>
>>> [ 2.185190] libphy: tg3 mdio bus: probed
>>> [ 2.229357] tsc: Refined TSC clocksource calibration: 2399.999 MHz
>>> [ 2.244993] config state: 1292
>>> [ 2.247136] tg3 0000:02:00.0 eth0: Tigon3 [partno(BCM57780) rev 57780001]
>>> (PCI Express) MAC address 00:19:99:ce:13:a6
>>> [ 2.249279] tg3 0000:02:00.0 eth0: attached PHY driver [Broadcom BCM57780]
>>> (mii_bus:phy_addr=200:01)
>>> [ 2.251460] tg3 0000:02:00.0 eth0: RXcsums[1] LinkChgREG[0]
>>> MIirq[0] ASF[0] TSOcap[1]
>>> [ 2.253672] tg3 0000:02:00.0 eth0: dma_rwctrl[76180000] dma_mask[64-bit]
>>> [...]
>>> [ 12.204692] tg3 0000:02:00.0 enp2s0: No firmware running
>>> [ 12.206653] config state: 1292
>>> [ 12.208655] config state: 1292
>>>
>>> Those are all three times the new debugging line gets hit when I
>>> boot my system using the supplied diagnostic patch.
>>>
>>> Hope that helps - of course, I'd gladly test any further
>>> (diagnostic) patches if required! Also, if I can provide any
>>> additional information that might be of value, just ask :-)
>>>
>> Nils/Marcelo, thanks for the input. Since reg 0x70 bit 15 is clear,
>> the chip is not setting the config retry bit. We had hoped this bit
>> was what caused the config access to return CRS, but it looks like it
>> is not.
>>
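The bit test described above can be sketched in a few lines of C, using the value from Nils's boot log (0x1292). The macro name CFG_RETRY_BIT and the helper functions are hypothetical, made up for this illustration; they are not taken from tg3.h, and only "reg 0x70, bit 15" comes from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for bit 15 of register 0x70; not a tg3.h macro. */
#define CFG_RETRY_BIT (UINT32_C(1) << 15)

/* Returns true when bit 15 is set in a raw register value. */
static bool cfg_retry_set(uint32_t pcistate)
{
	return (pcistate & CFG_RETRY_BIT) != 0;
}

/* Self-check using the value printed by the diagnostic patch. */
static void check_logged_value(void)
{
	assert(!cfg_retry_set(0x1292)); /* logged value: bit 15 is clear */
	assert(cfg_retry_set(0x9292));  /* same value with bit 15 forced on */
}
```

Since 0x1292 is below 0x8000, bit 15 cannot be set, which matches the conclusion drawn from the log.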
>> Btw, after forcing the error path (tg3_init_one -> tg3_halt) in the
>> driver, we are now able to reproduce the problem on a 5722 in house. We
>> are working with the HW team to narrow this down.
>>
>> Also it is not clear to me how reverting commit cfa6a7877b17a667 fixes
>> the problem.
>
> The full commit is 89665a6a71408796565bfd29cfa6a7877b17a667, and git
> works with any unique *prefix* of that. The current convention is to
> use the first 12 characters (I have "[core] abbrev = 12" in my
> .git/config). Unfortunately, suffixes don't work at all.
>
> Anyway, here's why I think 89665a6a7140 makes a difference. We're in this path:
>
>   pci_device_is_present
>     pci_bus_read_dev_vendor_id(..., crs_timeout = 0)
>       pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l)
>
> and for some reason the chip returns 0x00010001 for that 32-bit read.

Actually it returns just 0x00000001, but yeah, that's my understanding too.

  Marcelo

> Before 89665a6a7140, we compared all 32 bits with "*l == 0xffff0001".
> This is false, so pci_bus_read_dev_vendor_id() returns true, which
> means pci_device_is_present() is also true.
>
> After 89665a6a7140, we compare only the low 16 bits with ((*l &
> 0xffff) == 0x0001), which is true, so pci_bus_read_dev_vendor_id()
> returns false, and pci_device_is_present() is false.
>
> Bjorn
>
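
For anyone following along, the two comparisons described above can be put side by side in a small standalone C sketch. This is a simplified model, not the kernel code: the function names are made up for illustration, and the other checks that pci_bus_read_dev_vendor_id() performs (for example all-ones or all-zeros reads) are left out on purpose. It assumes, as in the call path above, that crs_timeout is 0, so a CRS match means "not present".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Comparison described for kernels before 89665a6a7140: all 32 bits. */
static bool is_crs_before(uint32_t l)
{
	return l == 0xffff0001;
}

/* Comparison described for kernels after 89665a6a7140: low 16 bits only. */
static bool is_crs_after(uint32_t l)
{
	return (l & 0xffff) == 0x0001;
}

/* Simplified model: with crs_timeout = 0, a CRS match means "not present". */
static bool present_before(uint32_t l) { return !is_crs_before(l); }
static bool present_after(uint32_t l)  { return !is_crs_after(l); }

/* Values mentioned in this thread, plus one ordinary-looking ID dword. */
static void check_thread_values(void)
{
	/* 0x00000001: what the chip is reported to return here */
	assert(present_before(0x00000001));
	assert(!present_after(0x00000001));

	/* 0x00010001: the value assumed in the quoted explanation */
	assert(present_before(0x00010001));
	assert(!present_after(0x00010001));

	/* 0xffff0001: the CRS pattern matched by both versions */
	assert(!present_before(0xffff0001));
	assert(!present_after(0xffff0001));

	/* Illustrative dword with vendor ID 0x14e4 in the low half */
	assert(present_before(0x169214e4));
	assert(present_after(0x169214e4));
}
```

Either way (0x00000001 or 0x00010001), the old check reports the device as present and the new one does not, so the 16-bit correction above does not change the conclusion.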
Thread overview: 24+ messages
2014-12-13 21:02 [bisected] tg3 broken in 3.18.0? Nils Holland
2014-12-15 15:06 ` Marcelo Ricardo Leitner
2014-12-16 16:04 ` Rajat Jain
2014-12-16 16:20 ` Bjorn Helgaas
2014-12-16 17:15 ` Michael Chan
2014-12-16 17:59 ` Marcelo Ricardo Leitner
2014-12-16 19:54 ` Michael Chan
2014-12-16 20:02 ` Marcelo Ricardo Leitner
2014-12-18 19:15 ` Bjorn Helgaas
2014-12-18 19:28 ` Prashant Sreedharan
2014-12-18 20:09 ` Marcelo Ricardo Leitner
2014-12-18 20:33 ` Marcelo Ricardo Leitner
2014-12-18 20:26 ` Nils Holland
2014-12-19 2:10 ` Prashant Sreedharan
2014-12-19 17:09 ` Bjorn Helgaas
2014-12-19 17:16 ` Marcelo Ricardo Leitner [this message]
2014-12-19 18:24 ` Rajat Jain
2014-12-19 18:53 ` Prashant Sreedharan
2014-12-19 19:37 ` Rajat Jain
2014-12-16 18:00 ` Marcelo Ricardo Leitner
2014-12-16 20:38 ` Nils Holland
2014-12-16 0:31 ` Bjorn Helgaas
-- strict thread matches above, loose matches on Subject: below --
2014-12-10 23:06 Nils Holland
2014-12-11 16:45 ` Marcelo Ricardo Leitner
2014-12-12 14:50 ` Jonathan Bither
2014-12-12 20:31 ` Nils Holland
2014-12-13 1:14 ` [bisected] " Nils Holland
2014-12-13 1:18 ` David Miller