* [PATCH 1/1] tg3: Avoid Send BD corruption
@ 2008-09-11 5:12 Matt Carlson
2008-09-11 5:46 ` David Miller
0 siblings, 1 reply; 8+ messages in thread
From: Matt Carlson @ 2008-09-11 5:12 UTC (permalink / raw)
To: davem; +Cc: netdev, Michael Chan, andy
On 5761 and 5784 ASIC revision chips, there is a chance that send BDs
can be corrupted when CLKREQ is enabled. This patch turns CLKREQ off
while the driver is in control of the device.
This patch also bumps up the version to 3.95.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
---
drivers/net/tg3.c | 43 ++++++++++++++++++++++++++++++++++---------
drivers/net/tg3.h | 1 +
2 files changed, 35 insertions(+), 9 deletions(-)
diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 71d2c5c..a9c751f 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -66,8 +66,8 @@
#define DRV_MODULE_NAME "tg3"
#define PFX DRV_MODULE_NAME ": "
-#define DRV_MODULE_VERSION "3.94"
-#define DRV_MODULE_RELDATE "August 14, 2008"
+#define DRV_MODULE_VERSION "3.95"
+#define DRV_MODULE_RELDATE "September 10, 2008"
#define TG3_DEF_MAC_MODE 0
#define TG3_DEF_RX_MODE 0
@@ -2039,6 +2039,23 @@ static int tg3_set_power_state(struct tg3 *tp, pci_power_t state)
tp->dev->name, state);
return -EINVAL;
}
+
+ /* Restore the CLKREQ setting. */
+ if (tp->tg3_flags3 & TG3_FLG3_CLKREQ_BUG) {
+ int pcie_cap;
+ u16 lnkctl;
+
+ pcie_cap = pci_find_capability(tp->pdev, PCI_CAP_ID_EXP);
+
+ pci_read_config_word(tp->pdev,
+ pcie_cap + PCI_EXP_LNKCTL,
+ &lnkctl);
+ lnkctl |= PCI_EXP_LNKCTL_CLKREQ_EN;
+ pci_write_config_word(tp->pdev,
+ pcie_cap + PCI_EXP_LNKCTL,
+ lnkctl);
+ }
+
misc_host_ctrl = tr32(TG3PCI_MISC_HOST_CTRL);
tw32(TG3PCI_MISC_HOST_CTRL,
misc_host_ctrl | MISC_HOST_CTRL_MASK_PCI_INT);
@@ -12099,18 +12116,26 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
pcie_cap = pci_find_capability(tp->pdev, PCI_CAP_ID_EXP);
if (pcie_cap != 0) {
+ u16 lnkctl;
+
tp->tg3_flags2 |= TG3_FLG2_PCI_EXPRESS;
pcie_set_readrq(tp->pdev, 4096);
- if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5906) {
- u16 lnkctl;
-
- pci_read_config_word(tp->pdev,
- pcie_cap + PCI_EXP_LNKCTL,
- &lnkctl);
- if (lnkctl & PCI_EXP_LNKCTL_CLKREQ_EN)
+ pci_read_config_word(tp->pdev,
+ pcie_cap + PCI_EXP_LNKCTL,
+ &lnkctl);
+ if (lnkctl & PCI_EXP_LNKCTL_CLKREQ_EN) {
+ if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5906)
tp->tg3_flags2 &= ~TG3_FLG2_HW_TSO_2;
+ if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5784 ||
+ GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5761) {
+ tp->tg3_flags3 |= TG3_FLG3_CLKREQ_BUG;
+ lnkctl &= ~PCI_EXP_LNKCTL_CLKREQ_EN;
+ pci_write_config_word(tp->pdev,
+ pcie_cap + PCI_EXP_LNKCTL,
+ lnkctl);
+ }
}
}
diff --git a/drivers/net/tg3.h b/drivers/net/tg3.h
index f5b8cab..49a017d 100644
--- a/drivers/net/tg3.h
+++ b/drivers/net/tg3.h
@@ -2517,6 +2517,7 @@ struct tg3 {
#define TG3_FLG3_RGMII_STD_IBND_DISABLE 0x00000100
#define TG3_FLG3_RGMII_EXT_IBND_RX_EN 0x00000200
#define TG3_FLG3_RGMII_EXT_IBND_TX_EN 0x00000400
+#define TG3_FLG3_CLKREQ_BUG 0x00000800
struct timer_list timer;
u16 timer_counter;
--
1.5.6.4
^ permalink raw reply related [flat|nested] 8+ messages in thread* Re: [PATCH 1/1] tg3: Avoid Send BD corruption
2008-09-11 5:12 [PATCH 1/1] tg3: Avoid Send BD corruption Matt Carlson
@ 2008-09-11 5:46 ` David Miller
2008-09-11 16:59 ` Matt Carlson
0 siblings, 1 reply; 8+ messages in thread
From: David Miller @ 2008-09-11 5:46 UTC (permalink / raw)
To: mcarlson; +Cc: netdev, mchan, andy
From: "Matt Carlson" <mcarlson@broadcom.com>
Date: Wed, 10 Sep 2008 22:12:45 -0700
> On 5761 and 5784 ASIC revision chips, there is a chance that send BDs
> can be corrupted when CLKREQ is enabled. This patch turns CLKREQ off
> while the driver is in control of the device.
>
> This patch also bumps up the version to 3.95.
>
> Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
> Signed-off-by: Michael Chan <mchan@broadcom.com>
Are users really hitting this or was this discovered by your own
internal testing and verification?
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] tg3: Avoid Send BD corruption
2008-09-11 5:46 ` David Miller
@ 2008-09-11 16:59 ` Matt Carlson
2008-09-11 21:51 ` David Miller
0 siblings, 1 reply; 8+ messages in thread
From: Matt Carlson @ 2008-09-11 16:59 UTC (permalink / raw)
To: David Miller
Cc: Matthew Carlson, netdev@vger.kernel.org, Michael Chan,
andy@greyhouse.net
On Wed, Sep 10, 2008 at 10:46:18PM -0700, David Miller wrote:
> From: "Matt Carlson" <mcarlson@broadcom.com>
> Date: Wed, 10 Sep 2008 22:12:45 -0700
>
> > On 5761 and 5784 ASIC revision chips, there is a chance that send BDs
> > can be corrupted when CLKREQ is enabled. This patch turns CLKREQ off
> > while the driver is in control of the device.
> >
> > This patch also bumps up the version to 3.95.
> >
> > Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
> > Signed-off-by: Michael Chan <mchan@broadcom.com>
>
> Are users really hitting this or was this discovered by your own
> internal testing and verification?
The bug was discovered by our own internal testing, but this bug is
severe enough to warrant submission to the net-2.6 kernel. I'll be
sending a patch to stable as well.
These chips are relatively new so there isn't a whole lot of opportunity
for users to experience the problem yet. As the chips are distributed,
the likelyhood that users will encounter this problem is pretty high.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] tg3: Avoid Send BD corruption
2008-09-11 16:59 ` Matt Carlson
@ 2008-09-11 21:51 ` David Miller
2008-09-11 22:55 ` Michael Chan
0 siblings, 1 reply; 8+ messages in thread
From: David Miller @ 2008-09-11 21:51 UTC (permalink / raw)
To: mcarlson; +Cc: netdev, mchan, andy
From: "Matt Carlson" <mcarlson@broadcom.com>
Date: Thu, 11 Sep 2008 09:59:13 -0700
> On Wed, Sep 10, 2008 at 10:46:18PM -0700, David Miller wrote:
> > Are users really hitting this or was this discovered by your own
> > internal testing and verification?
>
> The bug was discovered by our own internal testing, but this bug is
> severe enough to warrant submission to the net-2.6 kernel. I'll be
> sending a patch to stable as well.
>
> These chips are relatively new so there isn't a whole lot of opportunity
> for users to experience the problem yet. As the chips are distributed,
> the likelyhood that users will encounter this problem is pretty high.
That doesn't matter. That isn't the criteria for inclusion outside of
the merge window, or into -stable.
You should read postings such as the following one and the other ones
in it's thread so that you can become familiar with the criteria:
http://marc.info/?l=linux-netdev&m=122048757705315&w=2
And by those descriptions, this patch is not appropriate and I'm
therefore not going to try and submit it for 2.6.27 and risk getting
my head chopped off by Linus again in public.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] tg3: Avoid Send BD corruption
2008-09-11 21:51 ` David Miller
@ 2008-09-11 22:55 ` Michael Chan
2008-09-11 23:01 ` David Miller
0 siblings, 1 reply; 8+ messages in thread
From: Michael Chan @ 2008-09-11 22:55 UTC (permalink / raw)
To: David Miller; +Cc: Matthew Carlson, netdev@vger.kernel.org, andy@greyhouse.net
On Thu, 2008-09-11 at 14:51 -0700, David Miller wrote:
> From: "Matt Carlson" <mcarlson@broadcom.com>
> Date: Thu, 11 Sep 2008 09:59:13 -0700
>
> > On Wed, Sep 10, 2008 at 10:46:18PM -0700, David Miller wrote:
> > > Are users really hitting this or was this discovered by your own
> > > internal testing and verification?
> >
> > The bug was discovered by our own internal testing, but this bug is
> > severe enough to warrant submission to the net-2.6 kernel. I'll be
> > sending a patch to stable as well.
> >
> > These chips are relatively new so there isn't a whole lot of opportunity
> > for users to experience the problem yet. As the chips are distributed,
> > the likelyhood that users will encounter this problem is pretty high.
>
> That doesn't matter. That isn't the criteria for inclusion outside of
> the merge window, or into -stable.
Send BD corruption is quite serious as it can cause tg3_tx() to crash.
We've seen a number of similar crashes over the years caused by
re-ordered IOs and the nr_frags getting modified by HTB.
Since regression/security/oops fixes are allowed, shouldn't this qualify
since it prevents a crash in tg3_tx()?
>
> You should read postings such as the following one and the other ones
> in it's thread so that you can become familiar with the criteria:
>
> http://marc.info/?l=linux-netdev&m=122048757705315&w=2
>
> And by those descriptions, this patch is not appropriate and I'm
> therefore not going to try and submit it for 2.6.27 and risk getting
> my head chopped off by Linus again in public.
>
>
>
>
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] tg3: Avoid Send BD corruption
2008-09-11 22:55 ` Michael Chan
@ 2008-09-11 23:01 ` David Miller
2008-09-11 23:19 ` Michael Chan
0 siblings, 1 reply; 8+ messages in thread
From: David Miller @ 2008-09-11 23:01 UTC (permalink / raw)
To: mchan; +Cc: mcarlson, netdev, andy
From: "Michael Chan" <mchan@broadcom.com>
Date: Thu, 11 Sep 2008 15:55:58 -0700
> Send BD corruption is quite serious as it can cause tg3_tx() to crash.
> We've seen a number of similar crashes over the years caused by
> re-ordered IOs and the nr_frags getting modified by HTB.
>
> Since regression/security/oops fixes are allowed, shouldn't this qualify
> since it prevents a crash in tg3_tx()?
It's at best borderline, since tg3_tx() has reordering detection logic
which will reset and recover the card.
Until I see a real user trigger this I'm still leaning towards no.
Otherwise any driver maintainer can give me this spiel to get their
changes installed outside of the merge window, and that subverts the
entire purpose of the rules.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] tg3: Avoid Send BD corruption
2008-09-11 23:01 ` David Miller
@ 2008-09-11 23:19 ` Michael Chan
2008-09-11 23:30 ` David Miller
0 siblings, 1 reply; 8+ messages in thread
From: Michael Chan @ 2008-09-11 23:19 UTC (permalink / raw)
To: David Miller; +Cc: Matthew Carlson, netdev@vger.kernel.org, andy@greyhouse.net
On Thu, 2008-09-11 at 16:01 -0700, David Miller wrote:
> From: "Michael Chan" <mchan@broadcom.com>
> Date: Thu, 11 Sep 2008 15:55:58 -0700
>
> > Send BD corruption is quite serious as it can cause tg3_tx() to crash.
> > We've seen a number of similar crashes over the years caused by
> > re-ordered IOs and the nr_frags getting modified by HTB.
> >
> > Since regression/security/oops fixes are allowed, shouldn't this qualify
> > since it prevents a crash in tg3_tx()?
>
> It's at best borderline, since tg3_tx() has reordering detection logic
> which will reset and recover the card.
The re-ordering detection logic will allow corruption to happen once
only. It will then enable flushing of IOs and if it detects corruption
again, it will hit BUG(). That's why it wasn't able to prevent the
crash caused by HTB.
>
> Until I see a real user trigger this I'm still leaning towards no.
>
> Otherwise any driver maintainer can give me this spiel to get their
> changes installed outside of the merge window, and that subverts the
> entire purpose of the rules.
>
OK, we'll resubmit this for net-next then. But I still think this one
should qualify. We do internal testing to find as many problems as
possible so that real users don't see too many of them.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] tg3: Avoid Send BD corruption
2008-09-11 23:19 ` Michael Chan
@ 2008-09-11 23:30 ` David Miller
0 siblings, 0 replies; 8+ messages in thread
From: David Miller @ 2008-09-11 23:30 UTC (permalink / raw)
To: mchan; +Cc: mcarlson, netdev, andy
From: "Michael Chan" <mchan@broadcom.com>
Date: Thu, 11 Sep 2008 16:19:28 -0700
> OK, we'll resubmit this for net-next then. But I still think this one
> should qualify. We do internal testing to find as many problems as
> possible so that real users don't see too many of them.
If driver authors can do these kinds of things to their heart's
content, and then submit lots of patches fixing all the bugs they
find, where are we able to draw the line?
That's the problem. And right now my (and Linus's) solution is to push
back and say no unless it meets strict criteria.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2008-09-11 23:30 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-09-11 5:12 [PATCH 1/1] tg3: Avoid Send BD corruption Matt Carlson
2008-09-11 5:46 ` David Miller
2008-09-11 16:59 ` Matt Carlson
2008-09-11 21:51 ` David Miller
2008-09-11 22:55 ` Michael Chan
2008-09-11 23:01 ` David Miller
2008-09-11 23:19 ` Michael Chan
2008-09-11 23:30 ` David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).