linux-pci.vger.kernel.org archive mirror
* [PATCH 0/2] PCI: Support to workaround bus level HW issues
@ 2018-04-19 17:36 James Puthukattukaran
  2018-04-19 17:37 ` [PATCH 1/2] PCI: Add pci_bus_specific_read_dev_vendor_id() to workaround PCI switch specific issues prior to accessing newly added endpoint James Puthukattukaran
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: James Puthukattukaran @ 2018-04-19 17:36 UTC (permalink / raw)
  To: Alex Williamson, Sinan Kaya; +Cc: linux-pci@vger.kernel.org

There are bugs in certain PCIe switches that cause access violations
when an endpoint device is hotplugged. In particular, there's an issue with
certain IDT switches that triggers an ACS violation when bringing up a newly
plugged PCIe endpoint device. This is a major issue for platforms designed to
issue a fatal reset in the case of this event.

The first patch provides a framework for intercepting and working around
issues with parent devices to the endpoint being brought up.

The second patch provides the actual patch for the IDT switch issue using
that framework. The ACS feature is disabled in the IDT switch prior to endpoint
device detection and then re-enabled subsequent to that.

James

---

James Puthukattukaran (2):
  PCI: Add pci_bus_specific_read_dev_vendor_id() to workaround PCI
    switch specific issues prior to accessing newly added endpoint
  PCI: Implement workaround for the ACS bug in the IDT switch

 drivers/pci/pci.h    |   4 ++
 drivers/pci/probe.c  |  19 +++++++-
 drivers/pci/quirks.c | 130
+++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 152 insertions(+), 1 deletion(-)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/2] PCI: Add pci_bus_specific_read_dev_vendor_id() to workaround PCI switch specific issues prior to accessing newly added endpoint
  2018-04-19 17:36 [PATCH 0/2] PCI: Support to workaround bus level HW issues James Puthukattukaran
@ 2018-04-19 17:37 ` James Puthukattukaran
  2018-04-19 17:39 ` [PATCH 2/2] PCI: Implement workaround for the ACS bug in the IDT, switch James Puthukattukaran
  2018-04-19 18:08 ` [PATCH 0/2] PCI: Support to workaround bus level HW issues Alex Williamson
  2 siblings, 0 replies; 7+ messages in thread
From: James Puthukattukaran @ 2018-04-19 17:37 UTC (permalink / raw)
  To: Alex Williamson, Sinan Kaya; +Cc: linux-pci@vger.kernel.org


This patch provides a framework for implementing bus-specific quirks that
run before an endpoint device beneath that bus is accessed. The routine
pci_bus_specific_read_dev_vendor_id() can be called prior to accessing the
endpoint device itself in order to work around potential issues with the
parent device (switch). If there is nothing specific to be done for a
particular switch device, it falls through to the generic endpoint check,
i.e. pci_bus_generic_read_dev_vendor_id().

Signed-off-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
---
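The lookup described in the commit message can be sketched in user space as follows. This is only an illustration, not the kernel code: `struct mock_bridge`, `specific_read()`, `generic_read()`, `demo_quirk()` and the ID values they return are hypothetical stand-ins. The sketch also assumes that a quirk handler returns 1 when the endpoint was found and 0 when it was not, so the wrapper reports `ret > 0`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ANY_ID 0xffffu	/* stand-in for (u16)PCI_ANY_ID */

/* Hypothetical stand-in for the parent bridge (bus->self). */
struct mock_bridge {
	uint16_t vendor;
	uint16_t device;
};

struct bus_quirk {
	uint16_t vendor;
	uint16_t device;
	/* Assumed convention: 1 if the endpoint was found, 0 if not. */
	int (*handler)(const struct mock_bridge *bridge, uint32_t *id);
};

static int demo_quirk(const struct mock_bridge *bridge, uint32_t *id)
{
	(void)bridge;
	*id = 0x12345678;	/* pretend the endpoint answered */
	return 1;
}

static const struct bus_quirk quirks[] = {
	{ 0x111d, 0x80b5, demo_quirk },	/* IDT vendor ID, device from patch 2/2 */
	{ 0 }				/* sentinel: a NULL handler ends the table */
};

/* Negative: no quirk applies. 0 or 1: a quirk handled the probe. */
int specific_read(const struct mock_bridge *bridge, uint32_t *id)
{
	const struct bus_quirk *q;

	assert(id != NULL);
	if (!bridge)
		return -ENOTTY;

	for (q = quirks; q->handler; q++) {
		if ((q->vendor == bridge->vendor || q->vendor == ANY_ID) &&
		    (q->device == bridge->device || q->device == ANY_ID))
			return q->handler(bridge, id);
	}
	return -ENOTTY;
}

static bool generic_read(uint32_t *id)
{
	*id = 0xabcdef01;	/* pretend the plain config read succeeded */
	return true;
}

bool read_dev_vendor_id(const struct mock_bridge *bridge, uint32_t *id)
{
	int ret = specific_read(bridge, id);

	if (ret >= 0)
		return ret > 0;	/* a quirk ran: report what it found */

	return generic_read(id);
}
```

A negative return from the table lookup means that no quirk matched the parent, in which case the caller falls back to the generic read.
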
 drivers/pci/pci.h    |  2 ++
 drivers/pci/probe.c  | 19 ++++++++++++++++++-
 drivers/pci/quirks.c | 41 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 61 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 023f7cf..2d06689 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -225,6 +225,8 @@ enum pci_bar_type {
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign);
 bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
 				int crs_timeout);
+int pci_bus_specific_read_dev_vendor_id(struct pci_bus *bus, int devfn,
+				u32 *pl, int crs_timeout);
 int pci_setup_device(struct pci_dev *dev);
 int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
 		    struct resource *res, unsigned int reg);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index ac91b6f..b4d8cbd 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2097,7 +2097,7 @@ static bool pci_bus_wait_crs(struct pci_bus *bus,
int devfn, u32 *l,
 	return true;
 }

-bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
+bool pci_bus_generic_read_dev_vendor_id(struct pci_bus *bus, int devfn,
u32 *l,
 				int timeout)
 {
 	if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
@@ -2113,6 +2113,23 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus
*bus, int devfn, u32 *l,

 	return true;
 }
+
+
+bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
+					int timeout)
+{
+	int ret;
+
+	/* An opportunity to implement something specific for this device.
+         * For ex, implement a quirk prior to even accessing the device
+         */
+	ret = pci_bus_specific_read_dev_vendor_id(bus, devfn, l, timeout);
+	if (ret >= 0)
+		return ret > 0;
+
+	return(pci_bus_generic_read_dev_vendor_id(bus, devfn, l, timeout));
+}
+
 EXPORT_SYMBOL(pci_bus_read_dev_vendor_id);

 /*
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 2990ad1..c637162 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4741,3 +4741,44 @@ static void quirk_gpu_hda(struct pci_dev *hda)
 			      PCI_CLASS_MULTIMEDIA_HD_AUDIO, 8, quirk_gpu_hda);
 DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
 			      PCI_CLASS_MULTIMEDIA_HD_AUDIO, 8, quirk_gpu_hda);
+
+
+static const struct pci_bus_specific_quirk{
+	u16 vendor;
+	u16 device;
+	int (*bus_quirk)(struct pci_bus *bus, int devfn, u32 *l, int timeout);
+} pci_bus_specific_quirks[] = {
+	{0}
+};
+
+/*
+ * This routine provides the ability to implement a bus specific quirk
+ * prior to doing config accesses to the endpoint device itself. For
ex, there
+ * could be HW problems with the switch above the endpoint that causes
issues
+ * when accessing the endpoint device. Such workarounds "specific" to the
+ * parent could be implemented prior or subsequent to accesses to the
+ * endpoint itself
+ *
+ */
+int pci_bus_specific_read_dev_vendor_id(struct pci_bus *bus, int devfn,
u32 *l,
+					int timeout)
+{
+	const struct pci_bus_specific_quirk *i;
+	struct pci_dev *dev;
+
+	if (!bus || !bus->self)
+	        return -ENOTTY;
+
+	dev = bus->self;
+
+	/* Implement any quirks in the "bus" (switch, for ex) that causes
+	 * issues in accessing the endpoint */
+	for (i = pci_bus_specific_quirks; i->bus_quirk; i++) {
+		if ((i->vendor == dev->vendor ||
+			i->vendor == (u16)PCI_ANY_ID) &&
+			(i->device == dev->device ||
+			i->device == (u16)PCI_ANY_ID))
+				return(i->bus_quirk(bus, devfn, l, timeout));
+	}
+	return -ENOTTY;
+}
-- 
1.8.3.1


* [PATCH 2/2] PCI: Implement workaround for the ACS bug in the IDT, switch
  2018-04-19 17:36 [PATCH 0/2] PCI: Support to workaround bus level HW issues James Puthukattukaran
  2018-04-19 17:37 ` [PATCH 1/2] PCI: Add pci_bus_specific_read_dev_vendor_id() to workaround PCI switch specific issues prior to accessing newly added endpoint James Puthukattukaran
@ 2018-04-19 17:39 ` James Puthukattukaran
  2018-04-19 18:08 ` [PATCH 0/2] PCI: Support to workaround bus level HW issues Alex Williamson
  2 siblings, 0 replies; 7+ messages in thread
From: James Puthukattukaran @ 2018-04-19 17:39 UTC (permalink / raw)
  To: Alex Williamson, Sinan Kaya; +Cc: linux-pci@vger.kernel.org

The IDT switch incorrectly flags an ACS source violation on the completion
of a config read request to an endpoint device (IDT 89H32H8G3-YC,
errata #36), even though the PCI Express spec states that completions are
never affected by ACS Source Validation (PCI Spec 3.1, Section 6.12.1.1).
Here is the errata text:

"Item #36 - Downstream port applies ACS Source Validation to Completions
Section 6.12.1.1 of the PCI Express Base Specification 3.1 states
that completions are never affected
by ACS Source Validation. However, completions received by a
downstream port of the PCIe switch from a device that has not yet
captured a PCIe bus number are incorrectly dropped by ACS source
validation by the switch downstream port.

Workaround: Issue a CfgWr1 to the downstream device before issuing
the first CfgRd1 to the device.
This allows the downstream device to capture its bus number; ACS
source validation no longer stops
completions from being forwarded by the downstream port. It has been
observed that Microsoft Windows implements this workaround already;
however, some versions of Linux and other operating systems may not. "

The suggested workaround by IDT is to issue a configuration write to the
downstream device before issuing the first config read. This allows the
downstream device to capture its bus number, thus avoiding the ACS
violation on the completion. To make sure that the device is ready for
config accesses, we keep the existing behavior of retrying config reads
until one succeeds, and only then do the config write specified by the
errata. To avoid hitting the erratum during those config reads, we disable
ACS SV around this process.

The patch does the following -

1. Disable ACS source validation if enabled.
2. Wait for config space access to become available by reading vendor id
3. Do a config write to the end point (errata workaround)
4. Enable ACS source validation (if it was enabled to begin with)

Signed-off-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
---
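Steps 1 and 4 above amount to saving, clearing and restoring one bit in the ACS Control register. A minimal user-space sketch, assuming a mock register pair in place of real config accesses: `struct mock_acs`, `acs_sv_set()`, `probe_with_sv_masked()` and `probe_requires_sv_off()` are hypothetical names, and the config write of step 3 is left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ACS_SV 0x0001u	/* Source Validation bit in both cap and ctrl */

/* Hypothetical stand-in for the ACS capability of the downstream port. */
struct mock_acs {
	bool present;	/* does the port expose an ACS capability at all? */
	uint16_t cap;	/* ACS Capability register */
	uint16_t ctrl;	/* ACS Control register */
};

/*
 * Set or clear Source Validation and return its previous state:
 * 1 if it was enabled, 0 if it was disabled, -1 if unsupported.
 */
int acs_sv_set(struct mock_acs *acs, bool enable)
{
	int prev;

	assert(acs != NULL);
	if (!acs->present || !(acs->cap & ACS_SV))
		return -1;

	prev = !!(acs->ctrl & ACS_SV);
	if (enable)
		acs->ctrl |= ACS_SV;
	else
		acs->ctrl &= (uint16_t)~ACS_SV;

	return prev;
}

/* Mock probe that only succeeds while Source Validation is off. */
bool probe_requires_sv_off(const struct mock_acs *acs)
{
	return !(acs->ctrl & ACS_SV);
}

/* Run probe() with Source Validation masked, then restore the old state. */
bool probe_with_sv_masked(struct mock_acs *acs,
			  bool (*probe)(const struct mock_acs *acs))
{
	int prev = acs_sv_set(acs, false);
	bool found = probe(acs);

	/* Only re-enable if it was enabled to begin with. */
	if (prev > 0)
		acs_sv_set(acs, true);

	return found;
}
```

Returning the previous state from the setter lets the caller restore exactly what it found, so a port that had Source Validation disabled is left disabled.
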
 drivers/pci/pci.h    |  2 ++
 drivers/pci/quirks.c | 95
++++++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 94 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 2d06689..e801d8b 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -227,6 +227,8 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus,
int devfn, u32 *pl,
 				int crs_timeout);
 int pci_bus_specific_read_dev_vendor_id(struct pci_bus *bus, int devfn,
 				u32 *pl, int crs_timeout);
+bool pci_bus_generic_read_dev_vendor_id(struct pci_bus *bus, int devfn,
+				u32 *pl, int crs_timeout);
 int pci_setup_device(struct pci_dev *dev);
 int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
 		    struct resource *res, unsigned int reg);
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index c637162..0faad6f 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4743,22 +4743,111 @@ static void quirk_gpu_hda(struct pci_dev *hda)
 			      PCI_CLASS_MULTIMEDIA_HD_AUDIO, 8, quirk_gpu_hda);


+/*
+ * The IDT switch incorrectly flags an ACS source violation on a read
config
+ * request to an end point device on the completion (IDT 89H32H8G3-YC,
+ * errata #36) even though the PCI Express spec states that completions are
+ * never affected by ACS source violation (PCI Spec 3.1, Section
6.12.1.1).
+ * Here's the specific copy of the errata text --
+ *
+ * "Item #36 - Downstream port applies ACS Source Validation to Completions
+ * Section 6.12.1.1 of the PCI Express Base Specification 3.1 states
+ * that completions are never affected
+ * by ACS Source Validation. However, completions received by a
+ * downstream port of the PCIe switch from a device that has not yet
+ * captured a PCIe bus number are incorrectly dropped by ACS source
+ * validation by the switch downstream port."
+ *
+ * The suggested workaround by IDT is to issue a configuration write to the
+ * downstream device before issuing the first config read. This allows the
+ * downstream device to capture its bus number, thus avoiding the ACS
+ * violation on the completion. In order to make sure that the device
is ready
+ * for config accesses, we do what is currently done in making config reads
+ * till it succeeds and then do the config write as specified by the
errata.
+ * However, to avoid hitting the errata issue when doing config reads, we
+ * disable ACS SV around this process.
+ */
+static int pci_idt_acs_quirk(struct pci_bus *bus, int devfn, int enable,
+				bool found)
+{
+	int pos;
+	u16 cap;
+	u16 ctrl;
+	int retval;
+	struct pci_dev *dev = bus->self;
+
+
+	/* Write 0 to the devfn device under the PCIE switch (bus->self)
+	 * as part of forcing the devfn number to latch with the device
+	 * below */
+	if (found)
+		pci_bus_write_config_word(bus, devfn, PCI_VENDOR_ID, 0);
+
+
+	/* Enable/disable ACS SV feature (based on enable flag) */
+	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ACS);
+	if (!pos)
+		return -ENODEV;
+
+	pci_read_config_word(dev, pos + PCI_ACS_CAP, &cap);
+
+	if (!(cap & PCI_ACS_SV))
+		return -ENODEV;
+
+	pci_read_config_word(dev, pos + PCI_ACS_CTRL, &ctrl);
+
+	retval = !!(ctrl & cap & PCI_ACS_SV);
+	if (enable)
+		ctrl |= (cap & PCI_ACS_SV);
+	else
+		ctrl &= ~(cap & PCI_ACS_SV);
+
+	pci_write_config_word(dev, pos + PCI_ACS_CTRL, ctrl);
+
+	/* return the previous state of the ACS SV state i.e was SV enabled
+	 * or disabled? */
+	return retval;
+}
+
+static int pci_idt_bus_quirk(struct pci_bus *bus, int devfn, u32 *l,
+				int timeout)
+{
+	int enable;
+	bool found;
+
+	/* Disable acs for the IDT switch before attempting the initial
+	 * config accesses to the endpoint device. */
+	enable = pci_idt_acs_quirk(bus, devfn, 0, false);
+
+	/* found indicates whether the endpoint device was identified
+	 * as present or not */
+
+	found = pci_bus_generic_read_dev_vendor_id(bus, devfn, l, timeout);
+
+	/* re-enable acs feature for the switch again if it was enabled to
+ 	 * start with */	
+	if (enable > 0)
+		pci_idt_acs_quirk(bus, devfn, enable, found);
+
+	return (found ? 1 : 0);
+}
+
 static const struct pci_bus_specific_quirk{
 	u16 vendor;
 	u16 device;
 	int (*bus_quirk)(struct pci_bus *bus, int devfn, u32 *l, int timeout);
 } pci_bus_specific_quirks[] = {
+	{ PCI_VENDOR_ID_IDT, 0x80b5, pci_idt_bus_quirk},
 	{0}
 };

 /*
  * This routine provides the ability to implement a bus specific quirk
- * prior to doing config accesses to the endpoint device itself. For
ex, there
+ * prior to doing config accesses to the endpoint device itself. For
ex, there
  * could be HW problems with the switch above the endpoint that causes
issues
  * when accessing the endpoint device. Such workarounds "specific" to the
  * parent could be implemented prior or subsequent to accesses to the
  * endpoint itself
- *
  */
 int pci_bus_specific_read_dev_vendor_id(struct pci_bus *bus, int devfn,
u32 *l,
 					int timeout)
@@ -4767,7 +4856,7 @@ int pci_bus_specific_read_dev_vendor_id(struct
pci_bus *bus, int devfn, u32 *l,
 	struct pci_dev *dev;

 	if (!bus || !bus->self)
-	        return -ENOTTY;
+		return -ENOTTY;

 	dev = bus->self;


* Re: [PATCH 0/2] PCI: Support to workaround bus level HW issues
  2018-04-19 17:36 [PATCH 0/2] PCI: Support to workaround bus level HW issues James Puthukattukaran
  2018-04-19 17:37 ` [PATCH 1/2] PCI: Add pci_bus_specific_read_dev_vendor_id() to workaround PCI switch specific issues prior to accessing newly added endpoint James Puthukattukaran
  2018-04-19 17:39 ` [PATCH 2/2] PCI: Implement workaround for the ACS bug in the IDT, switch James Puthukattukaran
@ 2018-04-19 18:08 ` Alex Williamson
  2018-04-24 14:57   ` James Puthukattukaran
  2 siblings, 1 reply; 7+ messages in thread
From: Alex Williamson @ 2018-04-19 18:08 UTC (permalink / raw)
  To: James Puthukattukaran; +Cc: Sinan Kaya, linux-pci@vger.kernel.org

On Thu, 19 Apr 2018 13:36:00 -0400
James Puthukattukaran <james.puthukattukaran@oracle.com> wrote:

> There are bugs in certain PCIe switches that cause access violations
> when an endpoint device is hotplugged. In particular, there's an issue with
> certain IDT switches that triggers an ACS violation when bringing up a newly
> plugged PCIe endpoint device. This is a major issue for platforms designed to
> issue a fatal reset in the case of this event.
> 
> The first patch provides a framework for intercepting and working around
> issues with parent devices to the endpoint being brought up.
> 
> The second patch provides the actual patch for the IDT switch issue using
> that framework. The ACS feature is disabled in the IDT switch prior to endpoint
> device detection and then re-enabled subsequent to that.

I'm happy with the logic of the patch, but:

 - Patches are line wrapped
 - There are white space issues (space indenting and extra returns)
 - 'return(foo(bar));' has too many parens, use 'return foo(bar);'
 - Comment style, see Documentation/process/coding-style.rst for
   preferred multi-line inline comment style.  (I see mixed comment
   style in quirks.c, but probe.c is pretty consistent)
 - Please don't fixup style issues in patch 1/2 in patch 2/2

scripts/checkpatch.pl can find some of these for you.  Thanks,

Alex
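
For illustration, a small sketch of the style the review asks for: tab indentation, the preferred multi-line comment layout, and a return without extra parentheses. The functions `probe()` and `generic_probe()` are hypothetical and exist only so that the fragment compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, present only so the sketch is self-contained. */
static bool generic_probe(int devfn)
{
	return devfn >= 0;
}

bool probe(int devfn)
{
	/*
	 * Preferred multi-line comment: the opening marker sits on its
	 * own line, every following line starts with an aligned
	 * asterisk, and the closing marker gets its own line as well.
	 */
	if (devfn > 255)
		return false;

	/* No parentheses around the returned expression. */
	return generic_probe(devfn);
}
```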


* [PATCH 2/2] PCI: Implement workaround for the ACS bug in the IDT switch
  2018-04-24 14:51 [PATCH 0/2] PCI: SUpport " James Puthukattukaran
@ 2018-04-24 14:54 ` James Puthukattukaran
  2018-04-24 17:50   ` Alex Williamson
  0 siblings, 1 reply; 7+ messages in thread
From: James Puthukattukaran @ 2018-04-24 14:54 UTC (permalink / raw)
  To: Alex Williamson, Sinan Kaya; +Cc: linux-pci@vger.kernel.org

The IDT switch incorrectly flags an ACS source violation on the completion
of a config read request to an endpoint device (IDT 89H32H8G3-YC,
errata #36), even though the PCI Express spec states that completions are
never affected by ACS Source Validation (PCI Spec 3.1, Section 6.12.1.1).
Here is the errata text:

"Item #36 - Downstream port applies ACS Source Validation to Completions
Section 6.12.1.1 of the PCI Express Base Specification 3.1 states
that completions are never affected
by ACS Source Validation. However, completions received by a
downstream port of the PCIe switch from a device that has not yet
captured a PCIe bus number are incorrectly dropped by ACS source
validation by the switch downstream port.

Workaround: Issue a CfgWr1 to the downstream device before issuing
the first CfgRd1 to the device.
This allows the downstream device to capture its bus number; ACS
source validation no longer stops
completions from being forwarded by the downstream port. It has been
observed that Microsoft Windows implements this workaround already;
however, some versions of Linux and other operating systems may not. "

The suggested workaround by IDT is to issue a configuration write to the
downstream device before issuing the first config read. This allows the
downstream device to capture its bus number, thus avoiding the ACS
violation on the completion. To make sure that the device is ready for
config accesses, we keep the existing behavior of retrying config reads
until one succeeds, and only then do the config write specified by the
errata. To avoid hitting the erratum during those config reads, we disable
ACS SV around this process.

The patch does the following -

1. Disable ACS source validation if enabled.
2. Wait for config space access to become available by reading vendor id
3. Do a config write to the end point (errata workaround)
4. Enable ACS source validation (if it was enabled to begin with)

Signed-off-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
---
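The erratum and the four-step sequence above can be modelled with a small user-space simulation. This is a simplified sketch under stated assumptions, not the kernel implementation: the types and functions below are hypothetical, the endpoint is assumed to report bus 0 until its first config write, and CRS retries are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical endpoint: reports bus 0 until a config write reaches it. */
struct mock_endpoint {
	uint8_t captured_bus;
	uint32_t vendor_device;
};

/* Hypothetical downstream port that exhibits the erratum. */
struct mock_port {
	uint8_t secondary_bus;
	bool sv_enabled;
	struct mock_endpoint *ep;
};

/* A config write lets the endpoint capture the bus number it sits on. */
void cfg_write(struct mock_port *port)
{
	port->ep->captured_bus = port->secondary_bus;
}

/*
 * A config read succeeds only if its completion gets through the port.
 * Modelled erratum: with Source Validation on, a completion carrying
 * the wrong bus number is dropped.
 */
bool cfg_read(const struct mock_port *port, uint32_t *val)
{
	assert(val != NULL);
	if (port->sv_enabled &&
	    port->ep->captured_bus != port->secondary_bus)
		return false;

	*val = port->ep->vendor_device;
	return true;
}

/* The sequence from the patch description. */
bool probe_endpoint(struct mock_port *port, uint32_t *val)
{
	bool was_enabled = port->sv_enabled;
	bool found;

	port->sv_enabled = false;	/* 1. disable Source Validation */
	found = cfg_read(port, val);	/* 2. read the vendor ID */
	if (found)
		cfg_write(port);	/* 3. errata workaround write */
	port->sv_enabled = was_enabled;	/* 4. restore the previous state */

	return found;
}
```

In this model a plain read with Source Validation enabled fails before the sequence has run and succeeds afterwards, because the endpoint has captured its bus number by then.
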
 drivers/pci/pci.h    |  2 ++
 drivers/pci/quirks.c | 97 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 99 insertions(+)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 39ea6ee..d0d588d 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -227,6 +227,8 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
 				int crs_timeout);
 int pci_bus_specific_read_dev_vendor_id(struct pci_bus *bus, int devfn,
 				u32 *pl, int crs_timeout);
+bool pci_bus_generic_read_dev_vendor_id(struct pci_bus *bus, int devfn,
+				u32 *pl, int crs_timeout);
 int pci_setup_device(struct pci_dev *dev);
 int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
 		    struct resource *res, unsigned int reg);
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index c32c5ec..89cd47d 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4742,12 +4742,109 @@ static void quirk_gpu_hda(struct pci_dev *hda)
 DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
 			      PCI_CLASS_MULTIMEDIA_HD_AUDIO, 8, quirk_gpu_hda);
 
+/*
+ * The IDT switch incorrectly flags an ACS source violation on a read config
+ * request to an end point device on the completion (IDT 89H32H8G3-YC,
+ * errata #36) even though the PCI Express spec states that completions are
+ * never affected by ACS source violation (PCI Spec 3.1, Section 6.12.1.1).
+ * Here's the specific copy of the errata text --
+ *
+ * "Item #36 - Downstream port applies ACS Source Validation to Completions
+ * Section 6.12.1.1 of the PCI Express Base Specification 3.1 states
+ * that completions are never affected
+ * by ACS Source Validation. However, completions received by a
+ * downstream port of the PCIe switch from a device that has not yet
+ * captured a PCIe bus number are incorrectly dropped by ACS source
+ * validation by the switch downstream port."
+ *
+ * The suggested workaround by IDT is to issue a configuration write to the
+ * downstream device before issuing the first config read. This allows the
+ * downstream device to capture its bus number, thus avoiding the ACS
+ * violation on the completion. In order to make sure that the device is ready
+ * for config accesses, we do what is currently done in making config reads
+ * till it succeeds and then do the config write as specified by the errata.
+ * However, to avoid hitting the errata issue when doing config reads, we
+ * disable ACS SV around this process.
+ */
+static int pci_idt_acs_quirk(struct pci_bus *bus, int devfn, int enable,
+				bool found)
+{
+	int pos;
+	u16 cap;
+	u16 ctrl;
+	int retval;
+	struct pci_dev *dev = bus->self;
+
+
+	/* Write 0 to the devfn device under the PCIE switch (bus->self)
+	 * as part of forcing the devfn number to latch with the device
+	 * below
+	 */
+	if (found)
+		pci_bus_write_config_word(bus, devfn, PCI_VENDOR_ID, 0);
+
+
+	/* Enable/disable ACS SV feature (based on enable flag) */
+	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ACS);
+	if (!pos)
+		return -ENODEV;
+
+	pci_read_config_word(dev, pos + PCI_ACS_CAP, &cap);
+
+	if (!(cap & PCI_ACS_SV))
+		return -ENODEV;
+
+	pci_read_config_word(dev, pos + PCI_ACS_CTRL, &ctrl);
+
+	retval = !!(ctrl & cap & PCI_ACS_SV);
+	if (enable)
+		ctrl |= (cap & PCI_ACS_SV);
+	else
+		ctrl &= ~(cap & PCI_ACS_SV);
+
+	pci_write_config_word(dev, pos + PCI_ACS_CTRL, ctrl);
+
+	/* return the previous state of the ACS SV state i.e was SV enabled
+	 * or disabled?
+	 */
+	return retval;
+}
+
+static int pci_idt_bus_quirk(struct pci_bus *bus, int devfn, u32 *l,
+				int timeout)
+{
+	int enable;
+	bool found;
+
+	/*
+	 * Disable acs for the IDT switch before attempting the initial
+	 * config accesses to the endpoint device.
+	 */
+	enable = pci_idt_acs_quirk(bus, devfn, 0, false);
+
+	/*
+	 * found indicates whether the endpoint device was identified
+	 * as present or not
+	 */
+
+	found = pci_bus_generic_read_dev_vendor_id(bus, devfn, l, timeout);
+
+	/*
+	 * re-enable acs feature for the switch again if it was enabled to
+	 * start with
+	 */
+	if (enable > 0)
+		pci_idt_acs_quirk(bus, devfn, enable, found);
+
+	return found ? 1 : 0;
+}
 
 static const struct pci_bus_specific_quirk{
 	u16 vendor;
 	u16 device;
 	int (*bus_quirk)(struct pci_bus *bus, int devfn, u32 *l, int timeout);
 } pci_bus_specific_quirks[] = {
+	{ PCI_VENDOR_ID_IDT, 0x80b5, pci_idt_bus_quirk},
 	{0}
 };
 
-- 
1.8.3.1


* Re: [PATCH 0/2] PCI: Support to workaround bus level HW issues
  2018-04-19 18:08 ` [PATCH 0/2] PCI: Support to workaround bus level HW issues Alex Williamson
@ 2018-04-24 14:57   ` James Puthukattukaran
  0 siblings, 0 replies; 7+ messages in thread
From: James Puthukattukaran @ 2018-04-24 14:57 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Sinan Kaya, linux-pci@vger.kernel.org



On 04/19/2018 02:08 PM, Alex Williamson wrote:
> On Thu, 19 Apr 2018 13:36:00 -0400
> James Puthukattukaran <james.puthukattukaran@oracle.com> wrote:
> 
>> There are bugs in certain PCIe switches that cause access violations
>> when an endpoint device is hotplugged. In particular, there's an issue with
>> certain IDT switches that triggers an ACS violation when bringing up a newly
>> plugged PCIe endpoint device. This is a major issue for platforms designed to
>> issue a fatal reset in the case of this event.
>>
>> The first patch provides a framework for intercepting and working around
>> issues with parent devices to the endpoint being brought up.
>>
>> The second patch provides the actual patch for the IDT switch issue using
>> that framework. The ACS feature is disabled in the IDT switch prior to endpoint
>> device detection and then re-enabled subsequent to that.
> 
> I'm happy with the logic of the patch, but:
> 
>  - Patches are line wrapped
>  - There are white space issues (space indenting and extra returns)
>  - 'return(foo(bar));' has too many parens, use 'return foo(bar);'
>  - Comment style, see Documentation/process/coding-style.rst for
>    preferred multi-line inline comment style.  (I see mixed comment
>    style in quirks.c, but probe.c is pretty consistent)
>  - Please don't fixup style issues in patch 1/2 in patch 2/2
> 
> scripts/checkpatch.pl can find some of these for you.  Thanks,
> 
> 

Alex - 
I believe I resolved the issues you pointed out wrt style. I sent out a fresh set of patches.
Again, appreciate your time!
regards
James


* Re: [PATCH 2/2] PCI: Implement workaround for the ACS bug in the IDT switch
  2018-04-24 14:54 ` [PATCH 2/2] PCI: Implement workaround for the ACS bug in the IDT switch James Puthukattukaran
@ 2018-04-24 17:50   ` Alex Williamson
  0 siblings, 0 replies; 7+ messages in thread
From: Alex Williamson @ 2018-04-24 17:50 UTC (permalink / raw)
  To: James Puthukattukaran; +Cc: Sinan Kaya, linux-pci@vger.kernel.org

On Tue, 24 Apr 2018 10:54:26 -0400
James Puthukattukaran <james.puthukattukaran@oracle.com> wrote:

> The IDT switch incorrectly flags an ACS source violation on the completion
> of a config read request to an endpoint device (IDT 89H32H8G3-YC,
> errata #36), even though the PCI Express spec states that completions are
> never affected by ACS Source Validation (PCI Spec 3.1, Section 6.12.1.1).
> Here is the errata text:
> 
> "Item #36 - Downstream port applies ACS Source Validation to Completions
> Section 6.12.1.1 of the PCI Express Base Specification 3.1 states
> that completions are never affected
> by ACS Source Validation. However, completions received by a
> downstream port of the PCIe switch from a device that has not yet
> captured a PCIe bus number are incorrectly dropped by ACS source
> validation by the switch downstream port.
> 
> Workaround: Issue a CfgWr1 to the downstream device before issuing
> the first CfgRd1 to the device.
> This allows the downstream device to capture its bus number; ACS
> source validation no longer stops
> completions from being forwarded by the downstream port. It has been
> observed that Microsoft Windows implements this workaround already;
> however, some versions of Linux and other operating systems may not. "
> 
> The suggested workaround by IDT is to issue a configuration write to the
> downstream device before issuing the first config read. This allows the
> downstream device to capture its bus number, thus avoiding the ACS
> violation on the completion. To make sure that the device is ready for
> config accesses, we keep the existing behavior of retrying config reads
> until one succeeds, and only then do the config write specified by the
> errata. To avoid hitting the erratum during those config reads, we disable
> ACS SV around this process.
> 
> The patch does the following -
> 
> 1. Disable ACS source validation if enabled.
> 2. Wait for config space access to become available by reading vendor id
> 3. Do a config write to the end point (errata workaround)
> 4. Enable ACS source validation (if it was enabled to begin with)
> 
> Signed-off-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
> ---
>  drivers/pci/pci.h    |  2 ++
>  drivers/pci/quirks.c | 97 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 99 insertions(+)
> 
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 39ea6ee..d0d588d 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -227,6 +227,8 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
>  				int crs_timeout);
>  int pci_bus_specific_read_dev_vendor_id(struct pci_bus *bus, int devfn,
>  				u32 *pl, int crs_timeout);
> +bool pci_bus_generic_read_dev_vendor_id(struct pci_bus *bus, int devfn,
> +				u32 *pl, int crs_timeout);
>  int pci_setup_device(struct pci_dev *dev);
>  int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
>  		    struct resource *res, unsigned int reg);
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index c32c5ec..89cd47d 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -4742,12 +4742,109 @@ static void quirk_gpu_hda(struct pci_dev *hda)
>  DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
>  			      PCI_CLASS_MULTIMEDIA_HD_AUDIO, 8, quirk_gpu_hda);
>  
> +/*
> + * The IDT switch incorrectly flags an ACS source violation on a read config
> + * request to an end point device on the completion (IDT 89H32H8G3-YC,
> + * errata #36) even though the PCI Express spec states that completions are
> + * never affected by ACS source violation (PCI Spec 3.1, Section 6.12.1.1).
> + * Here's the specific copy of the errata text --
> + *
> + * "Item #36 - Downstream port applies ACS Source Validation to Completions
> + * Section 6.12.1.1 of the PCI Express Base Specification 3.1 states
> + * that completions are never affected
> + * by ACS Source Validation. However, completions received by a
> + * downstream port of the PCIe switch from a device that has not yet
> + * captured a PCIe bus number are incorrectly dropped by ACS source
> + * validation by the switch downstream port."
> + *
> + * The suggested workaround by IDT is to issue a configuration write to the
> + * downstream device before issuing the first config read. This allows the
> + * downstream device to capture its bus number, thus avoiding the ACS
> + * violation on the completion. In order to make sure that the device is ready
> + * for config accesses, we do what is currently done in making config reads
> + * till it succeeds and then do the config write as specified by the errata.
> + * However, to avoid hitting the errata issue when doing config reads, we
> + * disable ACS SV around this process.
> + */
> +static int pci_idt_acs_quirk(struct pci_bus *bus, int devfn, int enable,
> +				bool found)
> +{
> +	int pos;
> +	u16 cap;
> +	u16 ctrl;
> +	int retval;
> +	struct pci_dev *dev = bus->self;
> +
> +
> +	/* Write 0 to the devfn device under the PCIE switch (bus->self)
> +	 * as part of forcing the devfn number to latch with the device
> +	 * below
> +	 */
> +	if (found)
> +		pci_bus_write_config_word(bus, devfn, PCI_VENDOR_ID, 0);
> +
> +
> +	/* Enable/disable ACS SV feature (based on enable flag) */
> +	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ACS);
> +	if (!pos)
> +		return -ENODEV;
> +
> +	pci_read_config_word(dev, pos + PCI_ACS_CAP, &cap);
> +
> +	if (!(cap & PCI_ACS_SV))
> +		return -ENODEV;
> +
> +	pci_read_config_word(dev, pos + PCI_ACS_CTRL, &ctrl);
> +
> +	retval = !!(ctrl & cap & PCI_ACS_SV);
> +	if (enable)
> +		ctrl |= (cap & PCI_ACS_SV);
> +	else
> +		ctrl &= ~(cap & PCI_ACS_SV);
> +
> +	pci_write_config_word(dev, pos + PCI_ACS_CTRL, ctrl);
> +
> +	/* return the previous state of the ACS SV state i.e was SV enabled
> +	 * or disabled?
> +	 */
> +	return retval;
> +}
> +
> +static int pci_idt_bus_quirk(struct pci_bus *bus, int devfn, u32 *l,
> +				int timeout)
> +{
> +	int enable;
> +	bool found;
> +
> +	/*
> +	 * Disable acs for the IDT switch before attempting the initial
> +	 * config accesses to the endpoint device.
> +	 */
> +	enable = pci_idt_acs_quirk(bus, devfn, 0, false);
> +
> +	/*
> +	 * found indicates whether the endpoint device was identified
> +	 * as present or not
> +	 */
> +
> +	found = pci_bus_generic_read_dev_vendor_id(bus, devfn, l, timeout);
> +
> +	/*
> +	 * re-enable acs feature for the switch again if it was enabled to
> +	 * start with
> +	 */


Inconsistent comment style even within the same patch.  Otherwise,

Reviewed-by: Alex Williamson <alex.williamson@redhat.com>


> +	if (enable > 0)
> +		pci_idt_acs_quirk(bus, devfn, enable, found);
> +
> +	return found ? 1 : 0;
> +}
>  
>  static const struct pci_bus_specific_quirk{
>  	u16 vendor;
>  	u16 device;
>  	int (*bus_quirk)(struct pci_bus *bus, int devfn, u32 *l, int timeout);
>  } pci_bus_specific_quirks[] = {
> +	{ PCI_VENDOR_ID_IDT, 0x80b5, pci_idt_bus_quirk},
>  	{0}
>  };
>  

