From: Valentin Longchamp <valentin.longchamp@keymile.com>
To: Johannes Thumshirn <johannes.thumshirn@men.de>
Cc: "linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>
Subject: Re: EDAC PCIe errors when scannning the bus
Date: Thu, 20 Mar 2014 11:44:03 +0100
Message-ID: <532AC673.5070308@keymile.com>
In-Reply-To: <20140319155404.GA2045@jtlinux>

Hello Johannes,

On 03/19/2014 04:54 PM, Johannes Thumshirn wrote:
> On Wed, Mar 19, 2014 at 01:46:37PM +0100, Valentin Longchamp wrote:
>> Hello,
>>
>> We have a board that is based on Freescale's P2041 SoC. The board has 2 PCIe
>> buses with this topology:
>>
>> PCIe 0 <---> PEX8505 switch <---> 4 network devices
>> PCIe 2 <---> FPGA
>>
>> On 3.10.33 + a subset of the Freescale SDK 1.4 patches, both PCIe buses work
>> well and we are able to use the devices on them.
>>
>> For each bus, I however keep getting EDAC PCIe errors at the very first stage of
>> bus enumeration (please see the attached kernel log, with some debug output from
>> arch/powerpc/kernel/pci-common.c and drivers/pci/probe.c) for both buses.
>>
>> My current "understanding" of the situation is such: since PCI_PROBE_NORMAL is
>> used, pcibios_scan_phb() calls pci_scan_child_bus() that does a pci_scan_slot()
>> on the bus for 32 slots. The first pci_scan_slot() is successful and it
>> discovers the P2041's PCIe Controller. All the 31 other pci_scan_slot() calls
>> generate an EDAC PCIe error, that is triggered by the configuration read
>> transaction to read a hypothetical vendor ID of a device on the bus. This is
>> consistent with what the EDAC error handler reports (all 31 are the same):
>>
>>> PCIE error(s) detected
>>> PCIE ERR_DR register: 0x00020000
>>
>> ICCA bit is set: Access to an illegal configuration space from
>> PEX_CONFIG_ADDR/PEX_CONFIG_DATA was detected.
>>
>>> PCIE ERR_CAP_STAT register: 0x80000001
>>
>> To is set: Transaction originated from PEX_CONFIG_ADDR/PEX_CONFIG_DATA.
>>
>>> PCIE ERR_CAP_R0 register: 0x00000800
>>
>> FMT: 0b00, TYPE: 0b00100 (Config read I guess)
>>
>>> PCIE ERR_CAP_R1 register: 0x00000000
>>> PCIE ERR_CAP_R2 register: 0x00000000
>>> PCIE ERR_CAP_R3 register: 0x00000000
>>
>> Afterwards, pci_scan_child_bus() calls pcibios_fixup_bus (that maybe helps ?).
>> From here, since the P2041's PCIe Controller is a bridge, pci_scan_bridge is
>> called for this bus and all the devices are detected without having any
>> configuration transaction causing EDAC errors.
>>
>> Has anyone already observed such behavior? Why do these initial transactions
>> generate an error? What would be a possible fix to avoid these transaction
>> errors for these 31 (unneeded?) pci_scan_slot() calls on the initial bus?
>>
> 
> I've encountered similar problems on a P4080 based design (mine has additional
> machine checks that cause an oops). I haven't solved it yet, so I unfortunately
> can't offer you a fix. But I was told there are some errata workarounds that
> more or less could have an impact on PCIe behavior. Could you show me the output
> of U-Boot's errata command?

Here is the output for the errata command:

> => errata
> Work-around for Erratum CPU-A003999 enabled
> Work-around for Erratum DDR-A003473 enabled
> Work-around for Erratum ESDHC111 enabled
> Work-around for Erratum DDR-A003 enabled
> Work-around for Erratum A004510 enabled
> Work-around for Erratum SRIO-A004034 enabled
> Work-around for Erratum A004849 is not enabled
> Work-around for Erratum A004580 is not enabled
> Work-around for Erratum USB14 enabled

> 
> Especially if the workarounds for A-004580 and A-004849 are in place.
> 

Neither is enabled, so I am going to fix that. Surprisingly, A-004580 is not
defined for the P2041 in U-Boot even though it is also present in the P2041's
errata sheet, so I had to enable it myself.

However, I expect that enabling the workarounds for these two errata is good
for the system but will not solve the PCIe EDAC problem.

Thank you for the input.

Valentin
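
For readers following the thread, the scan behavior described in the quoted
report can be illustrated with a small toy model. This is only a sketch under
stated assumptions, not kernel code: read_vendor_id() and scan_root_bus() are
hypothetical helpers, the model assumes that only device 0 (the PCIe controller
itself) answers on the root bus, and the ICCA mask is simply the ERR_DR value
from the log as interpreted in the report.

```python
# Toy model of the root-bus scan described in the report; not kernel code.
VENDOR_ID_NONE = 0xFFFF      # vendor ID seen when no device answers a config read
ERR_DR_ICCA = 0x00020000     # ICCA bit, per the ERR_DR value quoted in the log


def read_vendor_id(present_devices, device):
    """Hypothetical stand-in for a config read of the vendor ID register."""
    return present_devices.get(device, VENDOR_ID_NONE)


def scan_root_bus(present_devices, slots=32):
    """Probe every slot once and split the slots into found and empty."""
    found, empty = [], []
    for device in range(slots):
        vendor = read_vendor_id(present_devices, device)
        (found if vendor != VENDOR_ID_NONE else empty).append(device)
    return found, empty


def icca_set(err_dr):
    """Return True if the ICCA bit is set in an ERR_DR value."""
    return bool(err_dr & ERR_DR_ICCA)


# Assumption: only device 0 responds. 0x1957 is Freescale's PCI vendor ID.
found, empty = scan_root_bus({0: 0x1957})
print(len(found), "device found,", len(empty), "empty slots probed")
print("ICCA set in 0x00020000:", icca_set(0x00020000))
```

In this model each of the empty-slot probes corresponds to one of the config
reads that, per the report, triggers an EDAC error on the real hardware.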
