From: Jonathan Cameron <jonathan.cameron@huawei.com>
To: Evangelos Petrongonas <epetron@amazon.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
Alex Williamson <alex.williamson@redhat.com>,
"Rafael J . Wysocki" <rafael@kernel.org>,
Len Brown <lenb@kernel.org>,
Pasha Tatashin <pasha.tatashin@soleen.com>,
David Matlack <dmatlack@google.com>,
Vipin Sharma <vipinsh@google.com>, Chris Li <chrisl@kernel.org>,
Jason Miu <jasonmiu@google.com>,
"Pratyush Yadav" <pratyush@kernel.org>,
Stanislav Spassov <stanspas@amazon.de>,
<linux-pci@vger.kernel.org>, <linux-acpi@vger.kernel.org>,
<linux-kernel@vger.kernel.org>, <nh-open-source@amazon.com>
Subject: Re: [RFC PATCH 03/13] pci: pcsc: infer cacheability of PCI capabilities
Date: Thu, 9 Oct 2025 15:17:36 +0100 [thread overview]
Message-ID: <20251009151736.00002f45@huawei.com> (raw)
In-Reply-To: <fb2f262e4733a56bf0ebd9ef5c9325880aea05cd.1759312886.git.epetron@amazon.de>
On Fri, 3 Oct 2025 09:00:39 +0000
Evangelos Petrongonas <epetron@amazon.de> wrote:
> Implement cacheability inference for PCI capabilities to
> determine which configuration space registers can be safely cached.
>
> The first 64 bytes of PCI configuration space follow a standardized
> format, allowing straightforward cacheability determination. For
> capability-specific registers, the implementation traverses the PCI
> capability list to identify supported capabilities.
>
> Cacheable registers are identified for the following capabilities:
> - Power Management (PM)
> - Message Signaled Interrupts (MSI)
> - Message Signaled Interrupts Extensions (MSI-X)
> - PCI Express
> - PCI Advanced Features (AF)
> - Enhanced Allocation (EA)
> - Vital Product Data (VPD)
> - Vendor Specific
>
> The implementation pre-populates the cache with known values including
> device/vendor IDs and header type to avoid unnecessary configuration
> space reads during initialization.
>
> We are currently not caching the Command/Status registers.
>
> The cacheability of all capabilities apart from MSI, are straightforward
> and can be deduced from the spec. Regarding MSI the MSI flags are read
> and based on this, the cacheability is inferred.
>
> Signed-off-by: Evangelos Petrongonas <epetron@amazon.de>
> Signed-off-by: Stanislav Spassov <stanspas@amazon.de>
A few minor things below.
> +
> +static void infer_capabilities_pointers(struct pci_dev *dev)
> +{
> + u8 pos, cap_id, next_cap;
> + u32 val;
> + int i;
> +
> + if (pcsc_hw_config_read(dev->bus, dev->devfn, PCI_CAPABILITY_LIST, 1,
> + &val) != PCIBIOS_SUCCESSFUL)
> + return;
> +
> + pos = (val & 0xFF) & ~0x3;
Given single byte read, shouldn't need the 0xFF masking.
Might be worth setting val = 0 at declaration though so that the compiler
can see it is assigned if we reach here.
> +
> + while (pos) {
> + if (pos < 0x40 || pos > 0xFE)
> + break;
> +
> + pos &= ~0x3;
I couldn't immediately find a spec statement of the bottom two bits being 0 for
the next capability pointers. (There is one for the PCI_CAPABILITY_LIST)
> + if (pcsc_hw_config_read(dev->bus, dev->devfn, pos, 2, &val) !=
> + PCIBIOS_SUCCESSFUL)
> + break;
> +
> + cap_id = val & 0xFF; /* PCI_CAP_LIST_ID */
> + next_cap = (val >> 8) & 0xFF; /* PCI_CAP_LIST_NEXT */
> +
> + bitmap_set(dev->pcsc->cachable_bitmask, pos, 2);
> + pcsc_update_byte(dev, pos, cap_id); /* PCI_CAP_LIST_ID */
> + pcsc_update_byte(dev, pos + 1,
> + next_cap); /* PCI_CAP_LIST_NEXT */
Could you do something like moving the bitmap_set before the pcsc_hw_config_read() and
cal pcsc_cached_config_read() to fill the cache directly during the read?
> +
> + pci_dbg(dev, "Capability ID %#x found at %#x\n", cap_id, pos);
> +
> + /* Check if this is a supported capability and infer cacheability */
> + for (i = 0; i < ARRAY_SIZE(PCSC_SUPPORTED_CAPABILITIES); i++) {
> + if (cap_id == PCSC_SUPPORTED_CAPABILITIES[i]) {
> + infer_capability_cacheability(dev, pos, cap_id);
> + break;
> + }
> + }
> +
> + /* Move to next capability */
> + pos = next_cap;
> + }
> +}
> +
> +static void infer_cacheability(struct pci_dev *dev)
> +{
> + if (WARN_ON(!dev || !dev->pcsc || !dev->pcsc->cfg_space))
> + return;
> +
> + bitmap_zero(dev->pcsc->cachable_bitmask, PCSC_CFG_SPC_SIZE);
> +
> + /* Type 0 Configuration Space Header */
> + if (dev->hdr_type == PCI_HEADER_TYPE_NORMAL) {
Unless you plan to add type 1 very soon I'd flip the logic to reduce
indent of the code that follows.
> + /*
> + * Mark cacheable registers in the PCI configuration space header.
> + * We cache read-only and rarely changing registers:
> + * - PCI_VENDOR_ID, PCI_DEVICE_ID (0x00-0x03)
> + * - PCI_CLASS_REVISION through PCI_CAPABILITY_LIST (0x08-0x34)
> + * Includes: CLASS_REVISION, CACHE_LINE_SIZE, LATENCY_TIMER,
> + * HEADER_TYPE, BIST, BASE_ADDRESS_0-5, CARDBUS_CIS,
> + * SUBSYSTEM_VENDOR_ID, SUBSYSTEM_ID, ROM_ADDRESS, CAPABILITY_LIST
> + * - PCI_INTERRUPT_LINE through PCI_MAX_LAT (0x3c-0x3f)
> + * Includes: INTERRUPT_LINE, INTERRUPT_PIN, MIN_GNT, MAX_LAT
> + */
> + bitmap_set(dev->pcsc->cachable_bitmask, PCI_VENDOR_ID, 4);
> + bitmap_set(dev->pcsc->cachable_bitmask, PCI_CLASS_REVISION, 45);
For this large range can you derive that 45 as something like
PCI_CAPABILITY_LIST - PCI_CLASS_REVISION + 1
Same applies for the other multifield bitmap sets
> + bitmap_set(dev->pcsc->cachable_bitmask, PCI_INTERRUPT_LINE, 4);
> +
> + /* Pre populate the cache with the values that we already know */
I'm curious - do you have perf numbers to show it's worth writing these 5 bytes
directly into the cache? Feels like complexity that maybe doesn't belong there
in initial patch but should come along later in a patch with numbers to support
the small amount of extra complexity.
> + pcsc_update_byte(dev, PCI_HEADER_TYPE,
> + dev->hdr_type |
> + (dev->multifunction ? 0x80 : 0));
> +
> + /*
> + * SR-IOV VFs must return 0xFFFF (PCI_ANY_ID) for vendor/device ID
> + * registers per PCIe spec.
> + */
> + if (dev->is_virtfn) {
> + pcsc_update_byte(dev, PCI_VENDOR_ID, 0xFF);
> + pcsc_update_byte(dev, PCI_VENDOR_ID + 1, 0xFF);
> + pcsc_update_byte(dev, PCI_DEVICE_ID, 0xFF);
> + pcsc_update_byte(dev, PCI_DEVICE_ID + 1, 0xFF);
> + } else {
> + if (dev->vendor != PCI_ANY_ID) {
> + pcsc_update_byte(dev, PCI_VENDOR_ID,
> + dev->vendor & 0xFF);
> + pcsc_update_byte(dev, PCI_VENDOR_ID + 1,
> + (dev->vendor >> 8) & 0xFF);
> + }
> + if (dev->device != PCI_ANY_ID) {
> + pcsc_update_byte(dev, PCI_DEVICE_ID,
> + dev->device & 0xFF);
> + pcsc_update_byte(dev, PCI_DEVICE_ID + 1,
> + (dev->device >> 8) & 0xFF);
> + }
> + }
> +
> + infer_capabilities_pointers(dev);
> + }
> +}
next prev parent reply other threads:[~2025-10-09 14:17 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-03 9:00 [RFC PATCH 00/13] Introduce PCI Configuration Space Cache (PCSC) Evangelos Petrongonas
2025-10-03 9:00 ` [RFC PATCH 01/13] pci: pcsc: Add plumbing for the " Evangelos Petrongonas
2025-10-09 12:38 ` Jonathan Cameron
2025-10-03 9:00 ` [RFC PATCH 02/13] pci: pcsc: implement basic functionality Evangelos Petrongonas
2025-10-09 13:12 ` Jonathan Cameron
2025-10-03 9:00 ` [RFC PATCH 03/13] pci: pcsc: infer cacheability of PCI capabilities Evangelos Petrongonas
2025-10-09 14:17 ` Jonathan Cameron [this message]
2025-10-03 9:00 ` [RFC PATCH 04/13] pci: pcsc: infer PCIe extended capabilities Evangelos Petrongonas
2025-10-09 14:24 ` Jonathan Cameron
2025-10-03 9:00 ` [RFC PATCH 05/13] pci: pcsc: control the cache via sysfs and kernel params Evangelos Petrongonas
2025-10-09 14:41 ` Jonathan Cameron
2025-10-03 9:00 ` [RFC PATCH 06/13] pci: pcsc: handle device resets Evangelos Petrongonas
2025-10-09 14:49 ` Jonathan Cameron
2025-10-03 9:00 ` [RFC PATCH 07/13] pci: pcsc: introduce statistic gathering tools Evangelos Petrongonas
2025-10-09 14:56 ` Jonathan Cameron
2025-10-03 9:00 ` [RFC PATCH 08/13] pci: Save only spec-defined configuration space Evangelos Petrongonas
2025-10-04 1:59 ` kernel test robot
2025-10-09 14:58 ` Jonathan Cameron
2025-10-03 9:00 ` [RFC PATCH 09/13] vfio: pci: Fill only spec-defined configuration space regions Evangelos Petrongonas
2025-10-03 9:00 ` [RFC PATCH 10/13] pci: pcsc: Use contiguous pages for the cache data Evangelos Petrongonas
2025-10-03 9:00 ` [RFC PATCH 11/13] pci: pcsc: Add kexec persistence support via KHO Evangelos Petrongonas
2025-10-03 9:00 ` [RFC PATCH 12/13] pci: pcsc: introduce persistence versioning Evangelos Petrongonas
2025-10-03 9:00 ` [RFC PATCH 13/13] pci: pcsc: introduce hashtable lookup to speed up restoration Evangelos Petrongonas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251009151736.00002f45@huawei.com \
--to=jonathan.cameron@huawei.com \
--cc=alex.williamson@redhat.com \
--cc=bhelgaas@google.com \
--cc=chrisl@kernel.org \
--cc=dmatlack@google.com \
--cc=epetron@amazon.de \
--cc=jasonmiu@google.com \
--cc=lenb@kernel.org \
--cc=linux-acpi@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=nh-open-source@amazon.com \
--cc=pasha.tatashin@soleen.com \
--cc=pratyush@kernel.org \
--cc=rafael@kernel.org \
--cc=stanspas@amazon.de \
--cc=vipinsh@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.