netdev.vger.kernel.org archive mirror
From: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
To: Prashant Sreedharan <prashant@broadcom.com>,
	Bjorn Helgaas <bhelgaas@google.com>
Cc: Michael Chan <mchan@broadcom.com>,
	Rajat Jain <rajatxjain@gmail.com>,
	Nils Holland <nholland@tisys.org>,
	David Miller <davem@davemloft.net>,
	netdev <netdev@vger.kernel.org>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	Rafael Wysocki <rjw@rjwysocki.net>
Subject: Re: [bisected] tg3 broken in 3.18.0?
Date: Thu, 18 Dec 2014 18:33:58 -0200
Message-ID: <54933A36.7010000@gmail.com>
In-Reply-To: <54933491.7020204@gmail.com>

On 18-12-2014 18:09, Marcelo Ricardo Leitner wrote:
> On 18-12-2014 17:28, Prashant Sreedharan wrote:
>> On Thu, 2014-12-18 at 12:15 -0700, Bjorn Helgaas wrote:
>>> On Tue, Dec 16, 2014 at 12:54 PM, Michael Chan <mchan@broadcom.com> wrote:
>>>> On Tue, 2014-12-16 at 15:59 -0200, Marcelo Ricardo Leitner wrote:
>>>>> It's a
>>>>> 02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5722
>>>>> Gigabit Ethernet PCI Express
>>>>> over here
>>>>>
>>>>> I put a WARN_ON(1) after those printks, and this is what I got:
>>>>>
>>>>> [    1.550640] pci 0000:02:00.0: 1st 1 1
>>>>> [    1.550643] pci 0000:02:00.0: crs_timeout: 0
>>>>> [    1.550645] ------------[ cut here ]------------
>>>>> [    1.550651] WARNING: CPU: 6 PID: 364 at drivers/pci/probe.c:1445 pci_bus_read_dev_vendor_id+0x1d4/0x1e0()
>>>>> [    1.550652] Modules linked in: i915(+) raid0 i2c_algo_bit drm_kms_helper drm e1000e(+) tg3(+) ptp pps_core video
>>>>> [    1.550660] CPU: 6 PID: 364 Comm: systemd-udevd Not tainted 3.18.0-rc6+ #8
>>>>> [    1.550661] Hardware name: Dell Inc. OptiPlex 9010/03K80F, BIOS A15 08/12/2013
>>>>> [    1.550662]  0000000000000000 000000004de2d8dc ffff8807eabdf948 ffffffff8173db46
>>>>> [    1.550665]  0000000000000000 0000000000000000 ffff8807eabdf988 ffffffff81094d41
>>>>> [    1.550667]  ffff8807eabdf968 ffff8807f1e27000 0000000000000000 0000000000000000
>>>>> [    1.550669] Call Trace:
>>>>> [    1.550675]  [<ffffffff8173db46>] dump_stack+0x46/0x58
>>>>> [    1.550679]  [<ffffffff81094d41>] warn_slowpath_common+0x81/0xa0
>>>>> [    1.550681]  [<ffffffff81094e5a>] warn_slowpath_null+0x1a/0x20
>>>>> [    1.550683]  [<ffffffff813b2864>] pci_bus_read_dev_vendor_id+0x1d4/0x1e0
>>>>> [    1.550687]  [<ffffffff813b7c3e>] pci_device_is_present+0x2e/0x50
>>>>> [    1.550693]  [<ffffffffa003364f>] tg3_chip_reset+0x2f/0x940 [tg3]
>>>>> [    1.550697]  [<ffffffffa0033f9f>] tg3_halt+0x3f/0x1e0 [tg3]
>>>>> [    1.550701]  [<ffffffffa0044f83>] tg3_init_one+0xb83/0x1a40 [tg3]
>>>>
>>>> So does it work if you use a non-zero crs_timeout?  The driver has
>>>> called tg3_halt() which may affect configuration read responses.  I need
>>>> to check with the hardware team to see if the 5722 will return CRS in
>>>> this scenario.
>>>
>>> Any updates from the hardware team?
>>>
>>> This is a pretty serious regression, but as far as I can tell, it is
>>> not a PCI bug.  The device should respond to a config read of vendor
>>> ID.  If the driver does something that makes the read return CRS
>>> status, I think the driver is responsible for doing whatever delay or
>>> other fixup is required.
>>>
>>> I'm inclined to reassign this bug to the tg3 driver unless you think
>>> the PCI core is doing something wrong here.
>>>
>>> Bjorn
>>
>> We were not able to reproduce this issue. Could you please check the
>> value of reg 0x70 before the pci_device_is_present() call is made?
>> If bit 15 is set, config access will be retried.
>>
>> --- a/drivers/net/ethernet/broadcom/tg3.c
>> +++ b/drivers/net/ethernet/broadcom/tg3.c
>> @@ -9025,6 +9025,7 @@ static int tg3_chip_reset(struct tg3 *tp)
>>           void (*write_op)(struct tg3 *, u32, u32);
>>           int i, err;
>>
>> +       printk(KERN_ERR "config state: %x\n", tr32(TG3PCI_PCISTATE));
>>           if (!pci_device_is_present(tp->pdev))
>>                   return -ENODEV;
>>
>
> With that PCI patch applied and my debugs, without the timeout hack (so crs_timeout=0):
>
> [    1.545554] config state: 12b2
> [    1.548636] pci 0000:02:00.0: 1st 1 1
> [    1.548637] pci 0000:02:00.0: crs_timeout: 0
> [    1.548783] tg3 0000:02:00.0 eth0: Tigon3 [partno(BCM95722) rev a200] (PCI Express) MAC address 00:0a:f7:2b:9b:39
> [    1.548785] tg3 0000:02:00.0 eth0: attached PHY is 5722/5756 (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[0])
> [    1.548786] tg3 0000:02:00.0 eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
> [    1.548787] tg3 0000:02:00.0 eth0: dma_rwctrl[76180000] dma_mask[64-bit]
> [    1.554389] tg3 0000:02:00.0 p1p1: renamed from eth0
> ...
>
> That's the only time your printk got printed.

My bad, I forgot I had configured the system not to bring that interface
up anymore. When I do bring it up, I get the same as Nils did:

[ 1743.678714] tg3 0000:02:00.0: irq 32 for MSI/MSI-X
[ 1745.554039] tg3 0000:02:00.0 p1p1: No firmware running
[ 1745.554724] config state: 12b2
[ 1745.557822] pci 0000:02:00.0: 1st 1 1
[ 1745.557827] pci 0000:02:00.0: crs_timeout: 0
[ 1745.559383] config state: 12b2
[ 1745.562470] pci 0000:02:00.0: 1st 1 1
[ 1745.562471] pci 0000:02:00.0: crs_timeout: 0
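
For reference, a minimal sketch of the bit test Prashant asked about,
applied to the value printed above. It assumes only what was said in this
thread, that bit 15 of reg 0x70 means config access will be retried. The
macro and function names are made up for this example and are not taken
from tg3.h.

```c
#include <stdint.h>

/* Hypothetical name: bit 15 of the register at offset 0x70, which
 * (per the quoted mail) indicates config accesses will be retried. */
#define EXAMPLE_PCISTATE_RETRY_BIT (1u << 15)

/* Returns 1 if the retry bit is set in the given register value,
 * 0 otherwise. */
static int example_pcistate_retry_set(uint32_t pcistate)
{
	return (pcistate & EXAMPLE_PCISTATE_RETRY_BIT) != 0;
}
```

Calling example_pcistate_retry_set(0x12b2) returns 0, since 0x12b2 is
below 0x8000, so bit 15 is clear in every "config state" line logged here.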

   Marcelo
