public inbox for linuxppc-dev@ozlabs.org
 help / color / mirror / Atom feed
From: Haakon Bugge <haakon.bugge@oracle.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	Niklas Schnelle <schnelle@linux.ibm.com>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Len Brown <lenb@kernel.org>,
	Mahesh J Salgaonkar <mahesh@linux.ibm.com>,
	Oliver O'Halloran <oohall@gmail.com>,
	Greg Kroah-Hartman <gregkh@suse.de>,
	Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>,
	Alex Williamson <alex@shazbot.org>,
	Johannes Thumshirn <morbidrsa@gmail.com>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [PATCH v3 2/2] PCI/ACPI: Confine program_hpx_type2 to the AER bits
Date: Wed, 28 Jan 2026 17:19:34 +0000	[thread overview]
Message-ID: <5B14DFEB-18F2-46C0-8DD1-166A3BC275B4@oracle.com> (raw)
In-Reply-To: <20260127222442.GA379147@bhelgaas>

> On Thu, Jan 22, 2026 at 02:09:54PM +0100, Håkon Bugge wrote:
>> program_hpx_type2() is today unconditionally called, despite the fact
>> that when the _HPX was added to the ACPI spec. v3.0, the description
>> stated:
>> 
>> OSPM [1] will only evaluate _HPX with Setting Record – Type 2 if OSPM
>> is not controlling the PCI Express Advanced Error Reporting
>> capability.
>> 
>> Hence, we only call program_hpx_type2() when the OSPM owns the PCIe
>> hotplug capability but not the AER.
>> 
>> The Advanced Configuration and Power Interface (ACPI) Specification
>> version 6.6 has a provision that gives the OSPM the ability to control
>> the other PCIe Device Control bits any way. In a note in section
>> 6.2.9, it is stated:
>> 
>> "OSPM may override the settings provided by the _HPX object's Type2
>> record (PCI Express Settings) or Type3 record (PCI Express Descriptor
>> Settings) when OSPM has assumed native control of the corresponding
>> feature."
>> 
>> So, in order to preserve the non-AER bits in PCIe Device Control, in
>> particular the performance sensitive ExtTag and RO, we make sure
>> program_hpx_type2() if called, doesn't modify any non-AER bits.
>> 
>> Also, when program_hpx_type2() is called, we completely avoid
>> modifying any bits in the Link Control register. However, if the _HPX
>> type 2 records contains bits indicating such modifications, we print
>> an info message.
>> 
>> [1] Operating System-directed configuration and Power Management
> 
> I looked at the specs again and pulled out some more details.  Here's
> what seemed relevant to me:
> 
>    PCI/ACPI: Restrict program_hpx_type2() to AER bits
> 
>    Previously program_hpx_type2() applied PCIe settings unconditionally, which
>    could incorrectly change bits like Extended Tag Field Enable and Enable
>    Relaxed Ordering.
> 
>    When _HPX was added to ACPI r3.0, the intent of the PCIe Setting Record
>    (Type 2) in sec 6.2.7.3 was to configure AER registers when the OS does not
>    own the AER Capability:
> 
>      The PCI Express setting record contains ... [the AER] Uncorrectable Error
>      Mask, Uncorrectable Error Severity, Correctable Error Mask ... to be used
>      when configuring registers in the Advanced Error Reporting Extended
>      Capability Structure ...
> 
>      OSPM will only evaluate _HPX with Setting Record – Type 2 if OSPM is not
>      controlling the PCI Express Advanced Error Reporting capability.
> 
>    ACPI r3.0b, sec 6.2.7.3, added more AER registers, including registers in
>    the PCIe Capability with AER-related bits, and the restriction that the OS
>    use this only when it owns PCIe native hotplug:
> 
>      ... when configuring PCI Express registers in the Advanced Error
>      Reporting Extended Capability Structure *or PCI Express Capability
>      Structure* ...
> 
>      An OS that has assumed ownership of native hot plug but does not ... have
>      ownership of the AER register set must use ... the Type 2 record to
>      program the AER registers ...
> 
>      However, since the Type 2 record also includes register bits that have
>      functions other than AER, the OS must ignore values ... that are not
>      applicable.
> 
>    Restrict program_hpx_type2() to only the intended purpose:
> 
>      - Apply settings only when OS owns PCIe native hotplug but not AER,
> 
>      - Only touch the AER-related bits (Error Reporting Enables) in Device
>        Control
> 
>      - Don't touch Link Control at all, since nothing there seems AER-related,
>        but log _HPX settings for debugging purposes
> 
>    Note that Read Completion Boundary is now configured elsewhere, since it is
>    unrelated to _HPX.

Thanks for the word-smithing and improved accuracy!

>> +	/* Log if _HPX attempts to modify PCIe Link Control register */
>> 	if (pcie_cap_has_lnkctl(dev)) {
>> -
>> -		/*
>> -		 * If the Root Port supports Read Completion Boundary of
>> -		 * 128, set RCB to 128.  Otherwise, clear it.
>> -		 */
>> -		hpx->pci_exp_lnkctl_and |= PCI_EXP_LNKCTL_RCB;
>> -		hpx->pci_exp_lnkctl_or &= ~PCI_EXP_LNKCTL_RCB;
>> -		if (pcie_root_rcb_set(dev))
>> -			hpx->pci_exp_lnkctl_or |= PCI_EXP_LNKCTL_RCB;
>> -
>> -		pcie_capability_clear_and_set_word(dev, PCI_EXP_LNKCTL,
>> -			~hpx->pci_exp_lnkctl_and, hpx->pci_exp_lnkctl_or);

This was what confused me a lot, the bit-wise NOT above. That must be wrong, as pcie_capability_clear_and_set_word() inverts the "clear" argument.

>> +		if (hpx->pci_exp_lnkctl_and)
>> +			pci_info(dev,
>> +				 "_HPX attempts to reset the following bits in PCIe Link Control: 0x%04x\n",
>> +				 hpx->pci_exp_lnkctl_and);
>> +		if (hpx->pci_exp_lnkctl_or)
>> +			pci_info(dev,
>> +				 "_HPX attempts to set the following bits in PCIe Link Control: 0x%04x\n",
>> +				 hpx->pci_exp_lnkctl_or);
> 
> Again I'm afraid I suggested some over-engineering, and even worse, I
> misinterpreted the pci_exp_lnkctl_and and pci_exp_lnkctl_or usage when
> I said "if pci_exp_lnkctl_and or pci_exp_lnkctl_or are *non-zero*".
> 
> Per spec, we are supposed to AND the register with pci_exp_lnkctl_and,
> so the interesting value would be anything other than 0xffff.  Since
> we OR it with pci_exp_lnkctl_or, the interesting values there would be
> anything non-zero.  So I think what we would want is something like
> this:
> 
> +	/* Log if _HPX attempts to modify PCIe Link Control register */
> 	if (pcie_cap_has_lnkctl(dev)) {
> -
> -		/*
> -		 * If the Root Port supports Read Completion Boundary of
> -		 * 128, set RCB to 128.  Otherwise, clear it.
> -		 */
> -		hpx->pci_exp_lnkctl_and |= PCI_EXP_LNKCTL_RCB;
> -		hpx->pci_exp_lnkctl_or &= ~PCI_EXP_LNKCTL_RCB;
> -		if (pcie_root_rcb_set(dev))
> -			hpx->pci_exp_lnkctl_or |= PCI_EXP_LNKCTL_RCB;
> -
> -		pcie_capability_clear_and_set_word(dev, PCI_EXP_LNKCTL,
> -			~hpx->pci_exp_lnkctl_and, hpx->pci_exp_lnkctl_or);
> +		if (hpx->pci_exp_lnkctl_and != 0xffff ||
> +		    hpx->pci_exp_lnkctl_or != 0)
> +			pci_info(dev, "_HPX attempts Link Control setting (AND %#06x OR %#06x)\n",
> +				 hpx->pci_exp_lnkctl_and,
> +				 hpx->pci_exp_lnkctl_or);
> 	}

Since we do not want to fix incorrect firmware in this respect, the above makes perfect sense to me. A v4 is on its way.


Thxs, Håkon


  reply	other threads:[~2026-01-28 22:15 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20260122130957.68757-1-haakon.bugge@oracle.com>
2026-01-22 13:09 ` [PATCH v3 2/2] PCI/ACPI: Confine program_hpx_type2 to the AER bits Håkon Bugge
2026-01-27 22:24   ` Bjorn Helgaas
2026-01-28 17:19     ` Haakon Bugge [this message]
2026-01-29 16:36       ` Haakon Bugge

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5B14DFEB-18F2-46C0-8DD1-166A3BC275B4@oracle.com \
    --to=haakon.bugge@oracle.com \
    --cc=alex@shazbot.org \
    --cc=bhelgaas@google.com \
    --cc=gregkh@suse.de \
    --cc=helgaas@kernel.org \
    --cc=kaneshige.kenji@jp.fujitsu.com \
    --cc=lenb@kernel.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mahesh@linux.ibm.com \
    --cc=morbidrsa@gmail.com \
    --cc=oohall@gmail.com \
    --cc=rafael@kernel.org \
    --cc=schnelle@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox