public inbox for linux-pci@vger.kernel.org
 help / color / mirror / Atom feed
From: "Krzysztof Wilczyński" <kwilczynski@kernel.org>
To: Bjorn Helgaas <bhelgaas@google.com>
Cc: "Bjorn Helgaas" <helgaas@kernel.org>,
	"Manivannan Sadhasivam" <mani@kernel.org>,
	"Lorenzo Pieralisi" <lpieralisi@kernel.org>,
	"Magnus Lindholm" <linmag7@gmail.com>,
	"Matt Turner" <mattst88@gmail.com>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"Christophe Leroy" <chleroy@kernel.org>,
	"Madhavan Srinivasan" <maddy@linux.ibm.com>,
	"Michael Ellerman" <mpe@ellerman.id.au>,
	"Nicholas Piggin" <npiggin@gmail.com>,
	"Dexuan Cui" <decui@microsoft.com>,
	"Krzysztof Hałasa" <khalasa@piap.pl>,
	"Lukas Wunner" <lukas@wunner.de>,
	"Oliver O'Halloran" <oohall@gmail.com>,
	"Saurabh Singh Sengar" <ssengar@microsoft.com>,
	"Shuan He" <heshuan@bytedance.com>,
	"Srivatsa Bhat" <srivatsabhat@microsoft.com>,
	"Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>,
	linux-pci@vger.kernel.org, linux-alpha@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org
Subject: [PATCH v4 00/24] PCI: Convert all dynamic sysfs attributes to static
Date: Sat, 11 Apr 2026 08:01:24 +0000	[thread overview]
Message-ID: <20260411080148.471335-1-kwilczynski@kernel.org> (raw)

Hello,

This series converts every dynamically allocated PCI sysfs attribute to
a static const definition.  After the full series, pci_sysfs_init() and
sysfs_initialized are gone, and every sysfs file is created by the
driver model at device_add() time.

Currently, the PCI resource files (resourceN, resourceN_wc) and the
legacy bus files (legacy_io, legacy_mem) are created dynamically
from two unsynchronised paths:

Path A: late_initcall

  pci_sysfs_init                        (late_initcall)
    sysfs_initialized = 1
    for_each_pci_dev
      pci_create_sysfs_dev_files
        sysfs_create_bin_file           (resourceN, resourceN_wc)
    pci_find_next_bus
      pci_create_legacy_files
        sysfs_create_bin_file           (legacy_io, legacy_mem)

Path B: device registration / hotplug

  pci_bus_add_devices
    pci_bus_add_device
      pci_create_sysfs_dev_files
        if (!sysfs_initialized) return  <- only guard
        sysfs_create_bin_file           (resourceN, resourceN_wc)

On most ACPI systems this does not race because PCI enumeration
completes at subsys_initcall time, before pci_sysfs_init() runs:

  subsys_initcall (level 4):
    acpi_pci_root_add
      pci_bus_add_device
        pci_create_sysfs_dev_files
          if (!sysfs_initialized)          <- not yet set
            return -EACCES

  late_initcall (level 7):
    pci_sysfs_init
      sysfs_initialized = 1
      for_each_pci_dev
        pci_create_sysfs_dev_files         <- creates the files, no race

On Devicetree platforms the host controller is a platform driver that
probes via the driver model, often on a workqueue, and overlaps with the
late_initcall:

  CPU 0 (late_initcall)                CPU 1 (driver probe)
  ---------------------------          ----------------------------
  pci_sysfs_init()
    sysfs_initialized = 1
    for_each_pci_dev(pdev)             pci_bus_add_device(pdev)
      pci_create_sysfs_dev_files()       pci_create_sysfs_dev_files()
        sysfs_create_bin_file()            sysfs_create_bin_file()
                                             -> "duplicate filename"

The same happens on ACPI when probing is asynchronous (hv_pci on
Azure, RISC-V with ACPI).

The duplicate causes sysfs_create_bin_file() to fail with -EEXIST.
pci_create_resource_files() then calls pci_remove_resource_files() in
its error unwind, tearing down files the other thread created and
still references through pdev->res_attr[].  This has caused kernel
panics on i.MX6 and boot failures on other platforms.

Several different fixes have been proposed over the years: reordering
the sysfs_initialized assignment, adding locks, checking
pci_dev_is_added(), setting pdev->res_attr[] to NULL after kfree
(which only prevents a double-free on the teardown path, not the
error unwind removing the other thread's files).  None would address the
root cause.

This has been reported a few times:

  - https://lore.kernel.org/linux-pci/20250702155112.40124-1-heshuan@bytedance.com/
  - https://lore.kernel.org/linux-pci/b51519d6-ce45-4b6d-8135-c70169bd110e@h-partners.com/
  - https://lore.kernel.org/linux-pci/1702093576-30405-1-git-send-email-ssengar@linux.microsoft.com/
  - https://lore.kernel.org/linux-pci/SY0P300MB04687548090B73E40AF97D8897B82@SY0P300MB0468.AUSP300.PROD.OUTLOOK.COM/
  - https://lore.kernel.org/linux-pci/20230105174736.GA1154719@bhelgaas/
  - https://lore.kernel.org/linux-pci/m3eebg9puj.fsf@t19.piap.pl/
  - https://lore.kernel.org/linux-pci/20200716110423.xtfyb3n6tn5ixedh@pali/
  - https://lore.kernel.org/linux-pci/1366196798-15929-1-git-send-email-artem.savkov@gmail.com/
  - https://bugzilla.kernel.org/show_bug.cgi?id=215515
  - https://bugzilla.kernel.org/show_bug.cgi?id=216888

With static attributes the driver model creates sysfs entries once per
device at device_add() time, under the device lock, eliminating the
late_initcall iteration and the race along with it.

	Krzysztof

---
Changes in v4:
  https://lore.kernel.org/linux-pci/20260410055040.39233-1-kwilczynski@kernel.org/

   - Added new Reviewed-by tags.
   - Added pci_resource_is_io() and pci_resource_is_mem() helpers
     for resource type checks, replacing the open-coded bitwise
     flag tests in pci_mmap_resource(), pci_resource_io(), and
     Alpha's pci_mmap_resource(), as per Ilpo Järvinen's
     suggestion.
   - Split the __pci_mmap_fits() cleanup into two patches.  An
     overflow fix for zero-length BARs, which now includes a
     Fixes: tag referencing the original Alpha PCI sysfs commit,
     and the WARN macro removal is a separate cleanup as per Ilpo
     Järvinen's suggestion.
   - Added a missing Fixes: tag to the Alpha lockdown check,
     referencing the commit that added the check to the generic
     path but missed Alpha's implementation.
   - Added PCI_LEGACY_IO_SIZE and PCI_LEGACY_MEM_SIZE macros to
     replace the raw literals used for legacy address space sizes.
     These are used in both Alpha's pci_mmap_legacy_page_range()
     and the static legacy attribute definitions, as per Ilpo
     Järvinen's suggestion.
   - Replaced sysfs_update_groups() in the BAR resize path with
     sysfs_remove_groups() before the resize and sysfs_create_groups()
     after, restoring the original teardown before BAR resize
     ordering.  This was reported by Sashiko, see:
     https://sashiko.dev/#/patchset/20260410055040.39233-1-kwilczynski%40kernel.org?part=7
   - Defined pci_dev_resource_attr_groups as a NULL macro when
     HAVE_PCI_MMAP and ARCH_GENERIC_PCI_MMAP_RESOURCE are both
     absent, so the resize path compiles unconditionally without
     #ifdef guards in the function body.  This was reported by
     Sashiko, see:
     https://sashiko.dev/#/patchset/20260410055040.39233-1-kwilczynski%40kernel.org?part=7
   - Moved the pci_legacy_has_sparse() prototype into the patch
     that introduces the function, alongside the existing
     pci_adjust_legacy_attr() declaration, to fix a bisection
     issue where Alpha would warn on -Wmissing-prototypes.
     This was reported by Sashiko, see:
     https://sashiko.dev/#/patchset/20260410055040.39233-1-kwilczynski%40kernel.org?part=18

Changes in v3:
  https://lore.kernel.org/linux-pci/20210910202623.2293708-1-kw@linux.com/

  - Updated for modern kernel releases and expanded scope.  The
    v2 only covered the generic resource files.  This version
    also converts Alpha's sparse/dense resource files and the
    legacy bus attributes, removing pci_sysfs_init() entirely.
  - Split the single macro definition into three distinct ones
    (per I/O, UC, and WC), to make sure that each carries only
    the callbacks its resource type needs.
  - Updated to use the new .bin_size callback, as the attributes
    are const, to replace using a->size directly, which was not
    ideal.  This required changes to pci_llseek_resource(), to
    ensure that it would work for device and bus-level attributes.
  - Updated the __resource_resize_store() to include CAP_SYS_ADMIN
    capabilities check.
  - Added the security_locked_down() check to Alpha's
    pci_mmap_resource(), to align with other architectures.

Changes in v2:
  https://lore.kernel.org/linux-pci/20210825212255.878043-1-kw@linux.com/

  - Refactored code so that the macros, helpers and internal
    functions can be used to correctly leverage the read(),
    write() and mmap() callbacks rather than to use the
    .is_bin_visible() callback to set up sysfs objects
    internals as this is not supported.
  - Refactored some if-statements to check for a resource
    flag first, and then call either arch_can_pci_mmap_io()
    or arch_can_pci_mmap_wc(), plus store result of testing
    for IORESOURCE_MEM and IORESOURCE_PREFETCH flags into
    a boolean variable, as per Bjorn Helgaas' suggestion.
  - Renamed pci_read_resource_io() and pci_write_resource_io()
    callbacks so that these are not specifically tied to I/O
    BARs read() and write() operations also as per Bjorn
    Helgaas' suggestion.
  - Updated style for code handling bitwise operations to
    match the style that is preferred as per Bjorn Helgaas'
    suggestion.
  - Updated commit messages adding more details about the
    implementation as requested by Bjorn Helgaas.

Krzysztof Wilczyński (24):
  PCI/sysfs: Use PCI resource accessor macros
  PCI: Add pci_resource_is_io() and pci_resource_is_mem() helpers
  PCI/sysfs: Only allow supported resource types in I/O and MMIO helpers
  PCI/sysfs: Use BAR length in pci_llseek_resource() when attr->size is
    zero
  PCI/sysfs: Add CAP_SYS_ADMIN check to __resource_resize_store()
  PCI/sysfs: Add static PCI resource attribute macros
  PCI/sysfs: Convert PCI resource files to static attributes
  PCI/sysfs: Convert __resource_resize_store() to use static attributes
  PCI/sysfs: Add stubs for pci_{create,remove}_sysfs_dev_files()
  PCI/sysfs: Limit pci_sysfs_init() late_initcall compile scope
  alpha/PCI: Add security_locked_down() check to pci_mmap_resource()
  alpha/PCI: Use BAR index in sysfs attr->private instead of resource
    pointer
  alpha/PCI: Use PCI resource accessor macros
  alpha/PCI: Clean up pci_mmap_resource()
  alpha/PCI: Fix __pci_mmap_fits() overflow for zero-length BARs
  alpha/PCI: Remove WARN from __pci_mmap_fits()
  alpha/PCI: Add static PCI resource attribute macros
  alpha/PCI: Convert resource files to static attributes
  PCI/sysfs: Remove pci_{create,remove}_sysfs_dev_files()
  PCI: Add macros for legacy I/O and memory address space sizes
  alpha/PCI: Compute legacy size in pci_mmap_legacy_page_range()
  PCI/sysfs: Add __weak pci_legacy_has_sparse() helper
  PCI/sysfs: Convert legacy I/O and memory attributes to static
    definitions
  PCI/sysfs: Remove pci_create_legacy_files() and pci_sysfs_init()

 arch/alpha/include/asm/pci.h   |  13 +-
 arch/alpha/kernel/pci-sysfs.c  | 373 +++++++++++----------
 arch/powerpc/include/asm/pci.h |   2 -
 drivers/pci/bus.c              |   1 -
 drivers/pci/pci-sysfs.c        | 575 +++++++++++++++++++--------------
 drivers/pci/pci.h              |  16 +-
 drivers/pci/probe.c            |   6 -
 drivers/pci/remove.c           |   3 -
 include/linux/pci.h            |  39 ++-
 9 files changed, 578 insertions(+), 450 deletions(-)

-- 
2.53.0


             reply	other threads:[~2026-04-11  8:01 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-11  8:01 Krzysztof Wilczyński [this message]
2026-04-11  8:01 ` [PATCH v4 01/24] PCI/sysfs: Use PCI resource accessor macros Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 02/24] PCI: Add pci_resource_is_io() and pci_resource_is_mem() helpers Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 03/24] PCI/sysfs: Only allow supported resource types in I/O and MMIO helpers Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 04/24] PCI/sysfs: Use BAR length in pci_llseek_resource() when attr->size is zero Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 05/24] PCI/sysfs: Add CAP_SYS_ADMIN check to __resource_resize_store() Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 06/24] PCI/sysfs: Add static PCI resource attribute macros Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 07/24] PCI/sysfs: Convert PCI resource files to static attributes Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 08/24] PCI/sysfs: Convert __resource_resize_store() to use " Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 09/24] PCI/sysfs: Add stubs for pci_{create,remove}_sysfs_dev_files() Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 10/24] PCI/sysfs: Limit pci_sysfs_init() late_initcall compile scope Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 11/24] alpha/PCI: Add security_locked_down() check to pci_mmap_resource() Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 12/24] alpha/PCI: Use BAR index in sysfs attr->private instead of resource pointer Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 13/24] alpha/PCI: Use PCI resource accessor macros Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 14/24] alpha/PCI: Clean up pci_mmap_resource() Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 15/24] alpha/PCI: Fix __pci_mmap_fits() overflow for zero-length BARs Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 16/24] alpha/PCI: Remove WARN from __pci_mmap_fits() Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 17/24] alpha/PCI: Add static PCI resource attribute macros Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 18/24] alpha/PCI: Convert resource files to static attributes Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 19/24] PCI/sysfs: Remove pci_{create,remove}_sysfs_dev_files() Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 20/24] PCI: Add macros for legacy I/O and memory address space sizes Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 21/24] alpha/PCI: Compute legacy size in pci_mmap_legacy_page_range() Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 22/24] PCI/sysfs: Add __weak pci_legacy_has_sparse() helper Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 23/24] PCI/sysfs: Convert legacy I/O and memory attributes to static definitions Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 24/24] PCI/sysfs: Remove pci_create_legacy_files() and pci_sysfs_init() Krzysztof Wilczyński
2026-04-11 10:40 ` [PATCH v4 00/24] PCI: Convert all dynamic sysfs attributes to static Magnus Lindholm

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260411080148.471335-1-kwilczynski@kernel.org \
    --to=kwilczynski@kernel.org \
    --cc=bhelgaas@google.com \
    --cc=chleroy@kernel.org \
    --cc=decui@microsoft.com \
    --cc=helgaas@kernel.org \
    --cc=heshuan@bytedance.com \
    --cc=ilpo.jarvinen@linux.intel.com \
    --cc=khalasa@piap.pl \
    --cc=linmag7@gmail.com \
    --cc=linux-alpha@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=lpieralisi@kernel.org \
    --cc=lukas@wunner.de \
    --cc=maddy@linux.ibm.com \
    --cc=mani@kernel.org \
    --cc=mattst88@gmail.com \
    --cc=mpe@ellerman.id.au \
    --cc=npiggin@gmail.com \
    --cc=oohall@gmail.com \
    --cc=richard.henderson@linaro.org \
    --cc=srivatsabhat@microsoft.com \
    --cc=ssengar@microsoft.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox