From: Florian Fainelli <f.fainelli@gmail.com>
To: "Pali Rohár" <pali@kernel.org>
Cc: Jeremy Linton <jeremy.linton@arm.com>,
	Bjorn Helgaas <helgaas@kernel.org>,
	linux-pci@vger.kernel.org, lorenzo.pieralisi@arm.com,
	nsaenz@kernel.org, bhelgaas@google.com, rjw@rjwysocki.net,
	lenb@kernel.org, robh@kernel.org, kw@linux.com,
	bcm-kernel-feedback-list@broadcom.com,
	linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-rpi-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 2/4] PCI: brcmstb: Add ACPI config space quirk
Date: Fri, 22 Oct 2021 10:29:48 -0700
Message-ID: <3a956549-3304-5a4c-3058-eccfac44d31b@gmail.com>
In-Reply-To: <20211022171728.vlxb3sfebfpgijmp@pali>

On 10/22/21 10:17 AM, Pali Rohár wrote:
> On Friday 22 October 2021 10:04:36 Florian Fainelli wrote:
>> On 10/5/21 7:07 PM, Florian Fainelli wrote:
>>>
>>>
>>> On 10/5/2021 3:25 PM, Jeremy Linton wrote:
>>>> Hi,
>>>>
>>>> On 10/5/21 2:43 PM, Pali Rohár wrote:
>>>>> Hello!
>>>>>
>>>>> On Tuesday 05 October 2021 10:57:18 Jeremy Linton wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 10/5/21 10:32 AM, Bjorn Helgaas wrote:
>>>>>>> On Thu, Aug 26, 2021 at 02:15:55AM -0500, Jeremy Linton wrote:
>>>>>>>> Additionally, some basic bus/device filtering exists to avoid sending
>>>>>>>> config transactions to invalid devices on the RP's primary or
>>>>>>>> secondary bus. A basic link check is also made to ensure that
>>>>>>>> something is operational on the secondary side before probing the
>>>>>>>> remainder of the config space. If either of these constraints is
>>>>>>>> violated and a config operation is lost in the ether because an EP
>>>>>>>> doesn't respond, an unrecoverable SERROR is raised.
>>>>>>>
>>>>>>> It's not "lost"; I assume the root port raises an error because it
>>>>>>> can't send a transaction over a link that is down.
>>>>>>
>>>>>> The problem is AFAIK because the root port doesn't do that.
>>>>>
>>>>> Interesting! Does it mean that the PCIe Root Complex / Host Bridge (which I
>>>>> guess also contains the logic for the Root Port) does not signal transaction
>>>>> failure for config requests? Or is it just your opinion? Because I'm
>>>>> dealing with similar issues and I'm trying to find a way to detect
>>>>> whether some PCIe IP signals a transaction error via an AXI SLVERR response
>>>>> or just does not send any response back. So if you know some way to
>>>>> check which one it is, I would like to know it too.
>>>>
>>>> This is my _opinion_ based on what I've heard of some other IP
>>>> integration issues, and what I've seen poking at this one from the
>>>> perspective of a SW guy rather than a HW guy. So, basically worthless.
>>>> But, you should consider that most of these cores/interconnects aren't
>>>> aware of PCIe completion semantics, so it's the root port's
>>>> responsibility to, say, gracefully translate a non-posted write that
>>>> doesn't have a completion for the interconnect it's attached to,
>>>> rather than tripping something generic like a SLVERR.
>>>>
>>>> Anyway, for this I would poke around the pile of exception registers,
>>>> with your specific processor's manual handy because a lot of them are
>>>> implementation defined.
>>>
>>> I should be able to get you an answer in the next few days on whether
>>> configuration space requests also generate an error towards the ARM CPU,
>>> since memory space requests most definitely do.
>>
>> Did not get an answer from the design team, but going through our bug
>> tracker, there was evidence of configuration space accesses also
>> generating external aborts:
>>
>> [    8.988237] Unhandled fault: synchronous external abort (0x96000210) at 0xffffff8009539004
>> [    9.026698] PC is at pci_generic_config_read32+0x30/0xb0
> 
> So this is an error caused by reading from config space.
> 
> Can you check whether writing to config space can also trigger a crash? If
> yes, I would like to know whether the write would also be a synchronous or
> rather an asynchronous abort.

Yes, it does, and AFAICT it always shows up as a system error (SError)
interrupt. Here is an example:

# setpci -d *:* latency_timer=40
[   25.909644] SError Interrupt on CPU2, code 0xbf000002 -- SError
[   25.909647] CPU: 2 PID: 1676 Comm: setpci Not tainted 5.10.70-0.2pre-ge3872e15011b #2
[   25.909649] Hardware name: BCM972165SV_V10 (DT)
[   25.909651] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
[   25.909652] pc : pci_user_write_config_byte+0x6c/0x78
[   25.909654] lr : pci_user_write_config_byte+0x68/0x78
[   25.909655] sp : ffffffc015853c20
[   25.909656] x29: ffffffc015853c20 x28: ffffff8003053000
[   25.909661] x27: 0000000000000000 x26: 0000000000000000
[   25.909664] x25: 0000000000000001 x24: ffffff8004a23780
[   25.909668] x23: ffffff80049aa000 x22: ffffffc015853d68
[   25.909671] x21: 0000000000000040 x20: 000000000000000d
[   25.909674] x19: 000000000000000e x18: 0000000000000000
[   25.909677] x17: 0000000000000000 x16: 0000000000000000
[   25.909680] x15: 0000000000000000 x14: 0000000000000000
[   25.909684] x13: 0000000000000000 x12: 0000000000000000
[   25.909687] x11: 0000000000000000 x10: 0000000000000000
[   25.909690] x9 : ffffffc010483214 x8 : 0000000000000000
[   25.909693] x7 : ffffff800498df00 x6 : ffffff80049a8380
[   25.909696] x5 : ffffffc015510000 x4 : ffffff80049a9800
[   25.909699] x3 : 0000000000000000 x2 : 000000000000000d
[   25.909702] x1 : 0000000000000000 x0 : 0000000000000000
[   25.909706] Kernel panic - not syncing: Asynchronous SError Interrupt
[   25.909708] CPU: 2 PID: 1676 Comm: setpci Not tainted 5.10.70-0.2pre-ge3872e15011b #2
[   25.909710] Hardware name: BCM972165SV_V10 (DT)
[   25.909711] Call trace:
[   25.909712]  dump_backtrace+0x0/0x1d0
[   25.909713]  show_stack+0x1c/0x24
[   25.909714]  dump_stack+0xd0/0x12c
[   25.909716]  panic+0x128/0x308
[   25.909717]  nmi_panic+0x50/0x70
[   25.909718]  arm64_serror_panic+0x74/0x80
[   25.909720]  do_serror+0x28/0x60
[   25.909721]  el1_error+0x8c/0x10c
[   25.909722]  pci_user_write_config_byte+0x6c/0x78
[   25.909724]  pci_write_config+0x7c/0x1a0
[   25.909725]  sysfs_kf_bin_write+0x64/0x84
[   25.909727]  kernfs_fop_write_iter+0xbc/0x170
[   25.909728]  new_sync_write+0x80/0xcc
[   25.909729]  vfs_write+0xec/0x110
[   25.909730]  ksys_pwrite64+0x50/0x8c
[   25.909732]  __arm64_sys_pwrite64+0x20/0x28
[   25.909733]  el0_svc_common.constprop.4+0x100/0x184
[   25.909735]  do_el0_svc+0x38/0x78
[   25.909736]  el0_svc+0x1c/0x28
[   25.909737]  el0_sync_handler+0x64/0x12c
[   25.909738]  el0_sync+0x148/0x180
[   25.909775] brcm-pcie 8b20000.pcie: Error: CFG Acc, 32bit, Write, Bus=1, Dev=0, Fun=0, Reg=0xc, lanes=01000000
[   26.136082] brcm-pcie 8b20000.pcie:  Type: TO=0 Abt=0 UnsupReq=0 AccTO=0 AccDsbld=1 Acc64bit=0
[   26.144709] SMP: stopping secondary CPUs
[   26.144711] Kernel Offset: disabled
[   26.144712] CPU features: 0x0040002,24002004
[   26.144713] Memory Limit: none
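
For reference, the two syndrome codes in the traces above can be split
into their architectural fields. The sketch below assumes the values the
kernel printed (0x96000210 for the read, 0xbf000002 for the write) are
raw ESR_EL1 contents, and it uses the ESR_ELx layout from the Arm
architecture reference manual (EC in bits [31:26], IL in bit 25, ISS in
bits [24:0]). The function and key names are made up for illustration;
this is not kernel code.

```python
# Minimal ESR_ELx decoder, assuming the logged codes are raw ESR_EL1 values.
# Only the two exception classes seen in this thread are named.
EC_NAMES = {
    0x25: "Data Abort taken without a change in Exception level",
    0x2F: "SError interrupt",
}


def decode_esr(esr):
    """Split a 32-bit syndrome value into (EC, IL, ISS)."""
    ec = (esr >> 26) & 0x3F      # exception class, bits [31:26]
    il = (esr >> 25) & 0x1       # instruction length, bit 25
    iss = esr & 0x1FFFFFF        # instruction specific syndrome, bits [24:0]
    return ec, il, iss


def describe_esr(esr):
    """Return a dict with the fields relevant to the two traces."""
    ec, il, iss = decode_esr(esr)
    info = {"ec": ec, "il": il, "iss": iss,
            "class": EC_NAMES.get(ec, "other")}
    if ec == 0x25:
        info["dfsc"] = iss & 0x3F        # data fault status code
        info["wnr"] = (iss >> 6) & 0x1   # 1 = write, 0 = read
        info["ea"] = (iss >> 9) & 0x1    # external abort type bit
    elif ec == 0x2F:
        info["ids"] = (iss >> 24) & 0x1  # 1 = ISS is implementation defined
    return info


if __name__ == "__main__":
    for code in (0x96000210, 0xBF000002):
        print(hex(code), describe_esr(code))
```

Under those assumptions, the read decodes as EC 0x25 with DFSC 0x10
(synchronous external abort, not on a translation table walk) and WnR
clear, while the write decodes as EC 0x2f (SError) with IDS set, so the
remaining ISS bits are implementation defined.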

-- 
Florian