From: Haakon Bugge <haakon.bugge@oracle.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
Niklas Schnelle <schnelle@linux.ibm.com>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Len Brown <lenb@kernel.org>,
Mahesh J Salgaonkar <mahesh@linux.ibm.com>,
Oliver O'Halloran <oohall@gmail.com>,
Greg Kroah-Hartman <gregkh@suse.de>,
Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>,
Alex Williamson <alex@shazbot.org>,
Johannes Thumshirn <morbidrsa@gmail.com>,
"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [PATCH v3 2/2] PCI/ACPI: Confine program_hpx_type2 to the AER bits
Date: Wed, 28 Jan 2026 17:19:34 +0000 [thread overview]
Message-ID: <5B14DFEB-18F2-46C0-8DD1-166A3BC275B4@oracle.com> (raw)
In-Reply-To: <20260127222442.GA379147@bhelgaas>
> On Thu, Jan 22, 2026 at 02:09:54PM +0100, Håkon Bugge wrote:
>> program_hpx_type2() is today unconditionally called, despite the fact
>> that when the _HPX was added to the ACPI spec. v3.0, the description
>> stated:
>>
>> OSPM [1] will only evaluate _HPX with Setting Record – Type 2 if OSPM
>> is not controlling the PCI Express Advanced Error Reporting
>> capability.
>>
>> Hence, we only call program_hpx_type2() when the OSPM owns the PCIe
>> hotplug capability but not the AER.
>>
>> The Advanced Configuration and Power Interface (ACPI) Specification
>> version 6.6 has a provision that gives the OSPM the ability to control
>> the other PCIe Device Control bits any way. In a note in section
>> 6.2.9, it is stated:
>>
>> "OSPM may override the settings provided by the _HPX object's Type2
>> record (PCI Express Settings) or Type3 record (PCI Express Descriptor
>> Settings) when OSPM has assumed native control of the corresponding
>> feature."
>>
>> So, in order to preserve the non-AER bits in PCIe Device Control, in
>> particular the performance sensitive ExtTag and RO, we make sure
>> program_hpx_type2() if called, doesn't modify any non-AER bits.
>>
>> Also, when program_hpx_type2() is called, we completely avoid
>> modifying any bits in the Link Control register. However, if the _HPX
>> type 2 records contains bits indicating such modifications, we print
>> an info message.
>>
>> [1] Operating System-directed configuration and Power Management
>
> I looked at the specs again and pulled out some more details. Here's
> what seemed relevant to me:
>
> PCI/ACPI: Restrict program_hpx_type2() to AER bits
>
> Previously program_hpx_type2() applied PCIe settings unconditionally, which
> could incorrectly change bits like Extended Tag Field Enable and Enable
> Relaxed Ordering.
>
> When _HPX was added to ACPI r3.0, the intent of the PCIe Setting Record
> (Type 2) in sec 6.2.7.3 was to configure AER registers when the OS does not
> own the AER Capability:
>
> The PCI Express setting record contains ... [the AER] Uncorrectable Error
> Mask, Uncorrectable Error Severity, Correctable Error Mask ... to be used
> when configuring registers in the Advanced Error Reporting Extended
> Capability Structure ...
>
> OSPM will only evaluate _HPX with Setting Record – Type 2 if OSPM is not
> controlling the PCI Express Advanced Error Reporting capability.
>
> ACPI r3.0b, sec 6.2.7.3, added more AER registers, including registers in
> the PCIe Capability with AER-related bits, and the restriction that the OS
> use this only when it owns PCIe native hotplug:
>
> ... when configuring PCI Express registers in the Advanced Error
> Reporting Extended Capability Structure *or PCI Express Capability
> Structure* ...
>
> An OS that has assumed ownership of native hot plug but does not ... have
> ownership of the AER register set must use ... the Type 2 record to
> program the AER registers ...
>
> However, since the Type 2 record also includes register bits that have
> functions other than AER, the OS must ignore values ... that are not
> applicable.
>
> Restrict program_hpx_type2() to only the intended purpose:
>
> - Apply settings only when OS owns PCIe native hotplug but not AER,
>
> - Only touch the AER-related bits (Error Reporting Enables) in Device
> Control
>
> - Don't touch Link Control at all, since nothing there seems AER-related,
> but log _HPX settings for debugging purposes
>
> Note that Read Completion Boundary is now configured elsewhere, since it is
> unrelated to _HPX.
Thanks for the word-smithing and improved accuracy!
>> + /* Log if _HPX attempts to modify PCIe Link Control register */
>> if (pcie_cap_has_lnkctl(dev)) {
>> -
>> - /*
>> - * If the Root Port supports Read Completion Boundary of
>> - * 128, set RCB to 128. Otherwise, clear it.
>> - */
>> - hpx->pci_exp_lnkctl_and |= PCI_EXP_LNKCTL_RCB;
>> - hpx->pci_exp_lnkctl_or &= ~PCI_EXP_LNKCTL_RCB;
>> - if (pcie_root_rcb_set(dev))
>> - hpx->pci_exp_lnkctl_or |= PCI_EXP_LNKCTL_RCB;
>> -
>> - pcie_capability_clear_and_set_word(dev, PCI_EXP_LNKCTL,
>> - ~hpx->pci_exp_lnkctl_and, hpx->pci_exp_lnkctl_or);
This was what confused me a lot, the bit-wise NOT above. That must be wrong, as pcie_capability_clear_and_set_word() inverts the "clear" argument.
>> + if (hpx->pci_exp_lnkctl_and)
>> + pci_info(dev,
>> + "_HPX attempts to reset the following bits in PCIe Link Control: 0x%04x\n",
>> + hpx->pci_exp_lnkctl_and);
>> + if (hpx->pci_exp_lnkctl_or)
>> + pci_info(dev,
>> + "_HPX attempts to set the following bits in PCIe Link Control: 0x%04x\n",
>> + hpx->pci_exp_lnkctl_or);
>
> Again I'm afraid I suggested some over-engineering, and even worse, I
> misinterpreted the pci_exp_lnkctl_and and pci_exp_lnkctl_or usage when
> I said "if pci_exp_lnkctl_and or pci_exp_lnkctl_or are *non-zero*".
>
> Per spec, we are supposed to AND the register with pci_exp_lnkctl_and,
> so the interesting value would be anything other than 0xffff. Since
> we OR it with pci_exp_lnkctl_or, the interesting values there would be
> anything non-zero. So I think what we would want is something like
> this:
>
> + /* Log if _HPX attempts to modify PCIe Link Control register */
> if (pcie_cap_has_lnkctl(dev)) {
> -
> - /*
> - * If the Root Port supports Read Completion Boundary of
> - * 128, set RCB to 128. Otherwise, clear it.
> - */
> - hpx->pci_exp_lnkctl_and |= PCI_EXP_LNKCTL_RCB;
> - hpx->pci_exp_lnkctl_or &= ~PCI_EXP_LNKCTL_RCB;
> - if (pcie_root_rcb_set(dev))
> - hpx->pci_exp_lnkctl_or |= PCI_EXP_LNKCTL_RCB;
> -
> - pcie_capability_clear_and_set_word(dev, PCI_EXP_LNKCTL,
> - ~hpx->pci_exp_lnkctl_and, hpx->pci_exp_lnkctl_or);
> + if (hpx->pci_exp_lnkctl_and != 0xffff ||
> + hpx->pci_exp_lnkctl_or != 0)
> + pci_info(dev, "_HPX attempts Link Control setting (AND %#06x OR %#06x)\n",
> + hpx->pci_exp_lnkctl_and,
> + hpx->pci_exp_lnkctl_or);
> }
Since we do not want to fix incorrect firmware in this respect, the above makes perfect sense to me. A v4 is on its way.
Thxs, Håkon
next prev parent reply other threads:[~2026-01-28 22:15 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20260122130957.68757-1-haakon.bugge@oracle.com>
2026-01-22 13:09 ` [PATCH v3 2/2] PCI/ACPI: Confine program_hpx_type2 to the AER bits Håkon Bugge
2026-01-27 22:24 ` Bjorn Helgaas
2026-01-28 17:19 ` Haakon Bugge [this message]
2026-01-29 16:36 ` Haakon Bugge
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5B14DFEB-18F2-46C0-8DD1-166A3BC275B4@oracle.com \
--to=haakon.bugge@oracle.com \
--cc=alex@shazbot.org \
--cc=bhelgaas@google.com \
--cc=gregkh@suse.de \
--cc=helgaas@kernel.org \
--cc=kaneshige.kenji@jp.fujitsu.com \
--cc=lenb@kernel.org \
--cc=linux-acpi@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=mahesh@linux.ibm.com \
--cc=morbidrsa@gmail.com \
--cc=oohall@gmail.com \
--cc=rafael@kernel.org \
--cc=schnelle@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox