* [PATCH] net-tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0
@ 2013-12-09 18:43 Nat Gurumoorthy
2013-12-09 19:00 ` Michael Chan
0 siblings, 1 reply; 13+ messages in thread
From: Nat Gurumoorthy @ 2013-12-09 18:43 UTC (permalink / raw)
To: nsujir, mchan; +Cc: netdev, linux-kernel, Nat Gurumoorthy
The new tg3 driver leaves REG_BASE_ADDR (PCI config offset 120)
uninitialized. From power on reset this register may have garbage in it. The
Register Base Address register defines the device local address of a
register. The data pointed to by this location is read or written using
the Register Data register (PCI config offset 128). When REG_BASE_ADDR has
garbage any read or write of Register Data Register (PCI 128) will cause the
PCI bus to lock up. The TCO watchdog will fire and bring down the system.
Signed-off-by: Nat Gurumoorthy <natg@google.com>
---
drivers/net/ethernet/broadcom/tg3.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index 369b736..9a904fd 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -16499,6 +16499,9 @@ static int tg3_get_invariants(struct tg3 *tp, const struct pci_device_id *ent)
/* Clear this out for sanity. */
tw32(TG3PCI_MEM_WIN_BASE_ADDR, 0);
+ /* Clear TG3PCI_REG_BASE_ADDR to prevent hangs. */
+ tw32(TG3PCI_REG_BASE_ADDR, 0);
+
pci_read_config_dword(tp->pdev, TG3PCI_PCISTATE,
&pci_state_reg);
if ((pci_state_reg & PCISTATE_CONV_PCI_MODE) == 0 &&
--
1.8.5.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] net-tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0
2013-12-09 18:43 [PATCH] net-tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0 Nat Gurumoorthy
@ 2013-12-09 19:00 ` Michael Chan
2013-12-09 19:22 ` Natarajan Gurumoorthy
0 siblings, 1 reply; 13+ messages in thread
From: Michael Chan @ 2013-12-09 19:00 UTC (permalink / raw)
To: Nat Gurumoorthy; +Cc: nsujir, netdev, linux-kernel
On Mon, 2013-12-09 at 10:43 -0800, Nat Gurumoorthy wrote:
> The new tg3 driver leaves REG_BASE_ADDR (PCI config offset 120)
> uninitialized. From power on reset this register may have garbage in it. The
> Register Base Address register defines the device local address of a
> register. The data pointed to by this location is read or written using
> the Register Data register (PCI config offset 128). When REG_BASE_ADDR has
> garbage any read or write of Register Data Register (PCI 128) will cause the
> PCI bus to lock up. The TCO watchdog will fire and bring down the system.
>
>
Is this to prevent problem from other software that may be scanning the
PCI config space? It won't help if this happens before the tg3 driver
is loaded, right?
Thanks.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] net-tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0
2013-12-09 19:00 ` Michael Chan
@ 2013-12-09 19:22 ` Natarajan Gurumoorthy
2013-12-09 21:18 ` Sergei Shtylyov
0 siblings, 1 reply; 13+ messages in thread
From: Natarajan Gurumoorthy @ 2013-12-09 19:22 UTC (permalink / raw)
To: Michael Chan; +Cc: nsujir, netdev, linux-kernel@vger.kernel.org
Michael,
We had crashes when the PCI config space got scanned via
/sys/devices/pcixxxx/....../config.
I agree that this fix will not help if the scan happens before the tg3
driver gets loaded.
Regards
Nat
On Mon, Dec 9, 2013 at 11:00 AM, Michael Chan <mchan@broadcom.com> wrote:
> On Mon, 2013-12-09 at 10:43 -0800, Nat Gurumoorthy wrote:
>> The new tg3 driver leaves REG_BASE_ADDR (PCI config offset 120)
>> uninitialized. From power on reset this register may have garbage in it. The
>> Register Base Address register defines the device local address of a
>> register. The data pointed to by this location is read or written using
>> the Register Data register (PCI config offset 128). When REG_BASE_ADDR has
>> garbage any read or write of Register Data Register (PCI 128) will cause the
>> PCI bus to lock up. The TCO watchdog will fire and bring down the system.
>>
>>
> Is this to prevent problem from other software that may be scanning the
> PCI config space? It won't help if this happens before the tg3 driver
> is loaded, right?
>
> Thanks.
>
--
Regards
Nat Gurumoorthy AB6SJ
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] net-tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0
2013-12-09 21:18 ` Sergei Shtylyov
@ 2013-12-09 21:07 ` Michael Chan
2013-12-10 6:58 ` Michael Chan
0 siblings, 1 reply; 13+ messages in thread
From: Michael Chan @ 2013-12-09 21:07 UTC (permalink / raw)
To: Sergei Shtylyov
Cc: Natarajan Gurumoorthy, nsujir, netdev,
linux-kernel@vger.kernel.org
On Tue, 2013-12-10 at 00:18 +0300, Sergei Shtylyov wrote:
> > We had crashes when the PCI config space got scanned via
> > /sys/devices/pcixxxx/....../config.
>
> > I agree that this fix will not help if the scan happens before the tg3
> > driver gets loaded.
>
> Then perhaps a better place for such fixup would be a PCI quirk?
>
Yes, I agree. Thanks.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] net-tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0
2013-12-09 19:22 ` Natarajan Gurumoorthy
@ 2013-12-09 21:18 ` Sergei Shtylyov
2013-12-09 21:07 ` Michael Chan
0 siblings, 1 reply; 13+ messages in thread
From: Sergei Shtylyov @ 2013-12-09 21:18 UTC (permalink / raw)
To: Natarajan Gurumoorthy, Michael Chan
Cc: nsujir, netdev, linux-kernel@vger.kernel.org
Hello.
On 12/09/2013 10:22 PM, Natarajan Gurumoorthy wrote:
> Michael,
> We had crashes when the PCI config space got scanned via
> /sys/devices/pcixxxx/....../config.
> I agree that this fix will not help if the scan happens before the tg3
> driver gets loaded.
Then perhaps a better place for such fixup would be a PCI quirk?
> Regards
> Nat
> On Mon, Dec 9, 2013 at 11:00 AM, Michael Chan <mchan@broadcom.com> wrote:
>> On Mon, 2013-12-09 at 10:43 -0800, Nat Gurumoorthy wrote:
>>> The new tg3 driver leaves REG_BASE_ADDR (PCI config offset 120)
>>> uninitialized. From power on reset this register may have garbage in it. The
>>> Register Base Address register defines the device local address of a
>>> register. The data pointed to by this location is read or written using
>>> the Register Data register (PCI config offset 128). When REG_BASE_ADDR has
>>> garbage any read or write of Register Data Register (PCI 128) will cause the
>>> PCI bus to lock up. The TCO watchdog will fire and bring down the system.
>> Is this to prevent problem from other software that may be scanning the
>> PCI config space? It won't help if this happens before the tg3 driver
>> is loaded, right?
>> Thanks.
WBR, Sergei
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] net-tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0
2013-12-09 21:07 ` Michael Chan
@ 2013-12-10 6:58 ` Michael Chan
2013-12-10 15:01 ` Natarajan Gurumoorthy
2013-12-10 18:43 ` David Miller
0 siblings, 2 replies; 13+ messages in thread
From: Michael Chan @ 2013-12-10 6:58 UTC (permalink / raw)
To: Sergei Shtylyov
Cc: Natarajan Gurumoorthy, nsujir, netdev,
linux-kernel@vger.kernel.org
On Mon, 2013-12-09 at 13:07 -0800, Michael Chan wrote:
> On Tue, 2013-12-10 at 00:18 +0300, Sergei Shtylyov wrote:
> > > We had crashes when the PCI config space got scanned via
> > > /sys/devices/pcixxxx/....../config.
> >
> > > I agree that this fix will not help if the scan happens before the tg3
> > > driver gets loaded.
> >
> > Then perhaps a better place for such fixup would be a PCI quirk?
> >
> Yes, I agree. Thanks.
>
On second thought, I think your original patch should be sufficient and
we don't need to add the PCI quirk to cover so many devices. The reason
is that indirect access needs to be explicitly enabled in the
MISC_HOST_CTRL (0x68) register. The default value for register 0x68
should have indirect access disabled.
Nat, does this match what you're seeing? Did you ever see any system
crash before tg3 loads?
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] net-tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0
2013-12-10 6:58 ` Michael Chan
@ 2013-12-10 15:01 ` Natarajan Gurumoorthy
2013-12-10 18:40 ` Michael Chan
2013-12-10 18:43 ` David Miller
1 sibling, 1 reply; 13+ messages in thread
From: Natarajan Gurumoorthy @ 2013-12-10 15:01 UTC (permalink / raw)
To: Michael Chan
Cc: Sergei Shtylyov, nsujir, netdev, linux-kernel@vger.kernel.org
Michael,
The only time I see crashes is after the tg3 driver has been
loaded into the system. I our use case we are poking around
/sys/devices/pcixxxx/......../config.
I guess you will incorporate the original patch into the driver
and we can abandon this patch.
Regards
Nat
On Mon, Dec 9, 2013 at 10:58 PM, Michael Chan <mchan@broadcom.com> wrote:
> On Mon, 2013-12-09 at 13:07 -0800, Michael Chan wrote:
>> On Tue, 2013-12-10 at 00:18 +0300, Sergei Shtylyov wrote:
>> > > We had crashes when the PCI config space got scanned via
>> > > /sys/devices/pcixxxx/....../config.
>> >
>> > > I agree that this fix will not help if the scan happens before the tg3
>> > > driver gets loaded.
>> >
>> > Then perhaps a better place for such fixup would be a PCI quirk?
>> >
>> Yes, I agree. Thanks.
>>
>
> On second thought, I think your original patch should be sufficient and
> we don't need to add the PCI quirk to cover so many devices. The reason
> is that indirect access needs to be explicitly enabled in the
> MISC_HOST_CTRL (0x68) register. The default value for register 0x68
> should have indirect access disabled.
>
> Nat, does this match what you're seeing? Did you ever see any system
> crash before tg3 loads?
>
--
Regards
Nat Gurumoorthy AB6SJ
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] net-tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0
2013-12-10 15:01 ` Natarajan Gurumoorthy
@ 2013-12-10 18:40 ` Michael Chan
0 siblings, 0 replies; 13+ messages in thread
From: Michael Chan @ 2013-12-10 18:40 UTC (permalink / raw)
To: Natarajan Gurumoorthy, davem
Cc: Sergei Shtylyov, nsujir, netdev, linux-kernel@vger.kernel.org
On Tue, 2013-12-10 at 07:01 -0800, Natarajan Gurumoorthy wrote:
> The only time I see crashes is after the tg3 driver has been
> loaded into the system. I our use case we are poking around
> /sys/devices/pcixxxx/......../config.
David, please apply this first patch.
Acked-by: Michael Chan <mchan@broadcom.com>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] net-tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0
2013-12-10 6:58 ` Michael Chan
2013-12-10 15:01 ` Natarajan Gurumoorthy
@ 2013-12-10 18:43 ` David Miller
2013-12-10 18:49 ` Michael Chan
1 sibling, 1 reply; 13+ messages in thread
From: David Miller @ 2013-12-10 18:43 UTC (permalink / raw)
To: mchan; +Cc: sergei.shtylyov, natg, nsujir, netdev, linux-kernel
From: Michael Chan <mchan@broadcom.com>
Date: Mon, 9 Dec 2013 22:58:44 -0800
> On Mon, 2013-12-09 at 13:07 -0800, Michael Chan wrote:
>> On Tue, 2013-12-10 at 00:18 +0300, Sergei Shtylyov wrote:
>> > > We had crashes when the PCI config space got scanned via
>> > > /sys/devices/pcixxxx/....../config.
>> >
>> > > I agree that this fix will not help if the scan happens before the tg3
>> > > driver gets loaded.
>> >
>> > Then perhaps a better place for such fixup would be a PCI quirk?
>> >
>> Yes, I agree. Thanks.
>>
>
> On second thought, I think your original patch should be sufficient and
> we don't need to add the PCI quirk to cover so many devices. The reason
> is that indirect access needs to be explicitly enabled in the
> MISC_HOST_CTRL (0x68) register. The default value for register 0x68
> should have indirect access disabled.
>
> Nat, does this match what you're seeing? Did you ever see any system
> crash before tg3 loads?
What if the kernel is booted via kexec, and the driver in the kernel
we are kexec'ing from left indirect access enabled in MISC_HOST_CTRL?
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] net-tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0
2013-12-10 18:43 ` David Miller
@ 2013-12-10 18:49 ` Michael Chan
2013-12-11 3:23 ` David Miller
0 siblings, 1 reply; 13+ messages in thread
From: Michael Chan @ 2013-12-10 18:49 UTC (permalink / raw)
To: David Miller; +Cc: sergei.shtylyov, natg, nsujir, netdev, linux-kernel
On Tue, 2013-12-10 at 13:43 -0500, David Miller wrote:
> What if the kernel is booted via kexec, and the driver in the kernel
> we are kexec'ing from left indirect access enabled in MISC_HOST_CTRL?
That should be ok. The driver will only use valid register offsets in
indirect mode during run time, so register 0x78 should point to a valid
register. If indirect mode is never used by the driver, it will point
to zero with this patch. So register 0x78 will be valid (or zero) in
the kexec'ed kernel.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] net-tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0
2013-12-10 18:49 ` Michael Chan
@ 2013-12-11 3:23 ` David Miller
2013-12-11 23:21 ` Michael Chan
0 siblings, 1 reply; 13+ messages in thread
From: David Miller @ 2013-12-11 3:23 UTC (permalink / raw)
To: mchan; +Cc: sergei.shtylyov, natg, nsujir, netdev, linux-kernel
From: Michael Chan <mchan@broadcom.com>
Date: Tue, 10 Dec 2013 10:49:39 -0800
> On Tue, 2013-12-10 at 13:43 -0500, David Miller wrote:
>
>> What if the kernel is booted via kexec, and the driver in the kernel
>> we are kexec'ing from left indirect access enabled in MISC_HOST_CTRL?
>
> That should be ok. The driver will only use valid register offsets in
> indirect mode during run time, so register 0x78 should point to a valid
> register. If indirect mode is never used by the driver, it will point
> to zero with this patch. So register 0x78 will be valid (or zero) in
> the kexec'ed kernel.
Ok, that may be true, but I'd like to consider the much larger issue
at hand.
If the indirect mechanism is enabled, some of the offsets that may be
in there might be value, but would be entirely undesirable to be read
because the read has side effects.
What if the interrupt status register is what gets read if the user
scans config space at just the wrong moment, and therefore an
interrupt gets lost?
I understand that the patch we are discussing is a serious improvement
from the current situation, so I will apply it and queue it up for
-stable.
However I think we need to do something reasonable to prevent the
kinds of situations I described above.
Thanks.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] net-tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0
2013-12-11 3:23 ` David Miller
@ 2013-12-11 23:21 ` Michael Chan
2013-12-12 0:35 ` David Miller
0 siblings, 1 reply; 13+ messages in thread
From: Michael Chan @ 2013-12-11 23:21 UTC (permalink / raw)
To: David Miller; +Cc: sergei.shtylyov, natg, nsujir, netdev, linux-kernel
On Tue, 2013-12-10 at 22:23 -0500, David Miller wrote:
> Ok, that may be true, but I'd like to consider the much larger issue
> at hand.
>
> If the indirect mechanism is enabled, some of the offsets that may be
> in there might be value, but would be entirely undesirable to be read
> because the read has side effects.
>
> What if the interrupt status register is what gets read if the user
> scans config space at just the wrong moment, and therefore an
> interrupt gets lost?
>
I did a quick audit of the driver to see if we are seriously exposed to
this problem. Only very old chips (5700 to 5703) use indirect register
access as workarounds for various issues. I looked at the IRQ path
which uses status block to determine if there is work to do. If status
block contains no new data, we check the PCI_STATE register and the bits
don't get cleared on read. MAC status register which is used to
determine link status does not have bits that are cleared on read. So
we should be ok.
We use the statistics block for counters on these older chips so there
is no conflict. On newer chips, we read counters from registers which
get cleared on read, but we never use indirect methods to read these
counters.
On all new chips, indirect method is not used except to download
firmware (on 57766 only). Userspace read will not interfere with
firmware download.
So it looks to me that we are in good shape. Thanks.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] net-tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0
2013-12-11 23:21 ` Michael Chan
@ 2013-12-12 0:35 ` David Miller
0 siblings, 0 replies; 13+ messages in thread
From: David Miller @ 2013-12-12 0:35 UTC (permalink / raw)
To: mchan; +Cc: sergei.shtylyov, natg, nsujir, netdev, linux-kernel
From: Michael Chan <mchan@broadcom.com>
Date: Wed, 11 Dec 2013 15:21:18 -0800
> So it looks to me that we are in good shape. Thanks.
Thanks for checking this out, and providing such detailed
analysis.
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2013-12-12 0:35 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-12-09 18:43 [PATCH] net-tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0 Nat Gurumoorthy
2013-12-09 19:00 ` Michael Chan
2013-12-09 19:22 ` Natarajan Gurumoorthy
2013-12-09 21:18 ` Sergei Shtylyov
2013-12-09 21:07 ` Michael Chan
2013-12-10 6:58 ` Michael Chan
2013-12-10 15:01 ` Natarajan Gurumoorthy
2013-12-10 18:40 ` Michael Chan
2013-12-10 18:43 ` David Miller
2013-12-10 18:49 ` Michael Chan
2013-12-11 3:23 ` David Miller
2013-12-11 23:21 ` Michael Chan
2013-12-12 0:35 ` David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).