public inbox for linux-pci@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 00/24] PCI: Convert all dynamic sysfs attributes to static
@ 2026-04-11  8:01 Krzysztof Wilczyński
  2026-04-11  8:01 ` [PATCH v4 01/24] PCI/sysfs: Use PCI resource accessor macros Krzysztof Wilczyński
                   ` (24 more replies)
  0 siblings, 25 replies; 26+ messages in thread
From: Krzysztof Wilczyński @ 2026-04-11  8:01 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bjorn Helgaas, Manivannan Sadhasivam, Lorenzo Pieralisi,
	Magnus Lindholm, Matt Turner, Richard Henderson, Christophe Leroy,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Dexuan Cui, Krzysztof Hałasa, Lukas Wunner,
	Oliver O'Halloran, Saurabh Singh Sengar, Shuan He,
	Srivatsa Bhat, Ilpo Järvinen, linux-pci, linux-alpha,
	linuxppc-dev

Hello,

This series converts every dynamically allocated PCI sysfs attribute to
a static const definition.  After the full series, pci_sysfs_init() and
sysfs_initialized are gone, and every sysfs file is created by the
driver model at device_add() time.

Currently, the PCI resource files (resourceN, resourceN_wc) and the
legacy bus files (legacy_io, legacy_mem) are created dynamically
from two unsynchronised paths:

Path A: late_initcall

  pci_sysfs_init                        (late_initcall)
    sysfs_initialized = 1
    for_each_pci_dev
      pci_create_sysfs_dev_files
        sysfs_create_bin_file           (resourceN, resourceN_wc)
    pci_find_next_bus
      pci_create_legacy_files
        sysfs_create_bin_file           (legacy_io, legacy_mem)

Path B: device registration / hotplug

  pci_bus_add_devices
    pci_bus_add_device
      pci_create_sysfs_dev_files
        if (!sysfs_initialized) return  <- only guard
        sysfs_create_bin_file           (resourceN, resourceN_wc)

On most ACPI systems this does not race because PCI enumeration
completes at subsys_initcall time, before pci_sysfs_init() runs:

  subsys_initcall (level 4):
    acpi_pci_root_add
      pci_bus_add_device
        pci_create_sysfs_dev_files
          if (!sysfs_initialized)          <- not yet set
            return -EACCES

  late_initcall (level 7):
    pci_sysfs_init
      sysfs_initialized = 1
      for_each_pci_dev
        pci_create_sysfs_dev_files         <- creates the files, no race

On Devicetree platforms the host controller is a platform driver that
probes via the driver model, often on a workqueue, and overlaps with the
late_initcall:

  CPU 0 (late_initcall)                CPU 1 (driver probe)
  ---------------------------          ----------------------------
  pci_sysfs_init()
    sysfs_initialized = 1
    for_each_pci_dev(pdev)             pci_bus_add_device(pdev)
      pci_create_sysfs_dev_files()       pci_create_sysfs_dev_files()
        sysfs_create_bin_file()            sysfs_create_bin_file()
                                             -> "duplicate filename"

The same happens on ACPI when probing is asynchronous (hv_pci on
Azure, RISC-V with ACPI).

The duplicate causes sysfs_create_bin_file() to fail with -EEXIST.
pci_create_resource_files() then calls pci_remove_resource_files() in
its error unwind, tearing down files the other thread created and
still references through pdev->res_attr[].  This has caused kernel
panics on i.MX6 and boot failures on other platforms.

Several different fixes have been proposed over the years: reordering
the sysfs_initialized assignment, adding locks, checking
pci_dev_is_added(), setting pdev->res_attr[] to NULL after kfree
(which only prevents a double-free on the teardown path, not the
error unwind removing the other thread's files).  None would address the
root cause.

This has been reported a few times:

  - https://lore.kernel.org/linux-pci/20250702155112.40124-1-heshuan@bytedance.com/
  - https://lore.kernel.org/linux-pci/b51519d6-ce45-4b6d-8135-c70169bd110e@h-partners.com/
  - https://lore.kernel.org/linux-pci/1702093576-30405-1-git-send-email-ssengar@linux.microsoft.com/
  - https://lore.kernel.org/linux-pci/SY0P300MB04687548090B73E40AF97D8897B82@SY0P300MB0468.AUSP300.PROD.OUTLOOK.COM/
  - https://lore.kernel.org/linux-pci/20230105174736.GA1154719@bhelgaas/
  - https://lore.kernel.org/linux-pci/m3eebg9puj.fsf@t19.piap.pl/
  - https://lore.kernel.org/linux-pci/20200716110423.xtfyb3n6tn5ixedh@pali/
  - https://lore.kernel.org/linux-pci/1366196798-15929-1-git-send-email-artem.savkov@gmail.com/
  - https://bugzilla.kernel.org/show_bug.cgi?id=215515
  - https://bugzilla.kernel.org/show_bug.cgi?id=216888

With static attributes the driver model creates sysfs entries once per
device at device_add() time, under the device lock, eliminating the
late_initcall iteration and the race along with it.

	Krzysztof

---
Changes in v4:
  https://lore.kernel.org/linux-pci/20260410055040.39233-1-kwilczynski@kernel.org/

   - Added new Reviewed-by tags.
   - Added pci_resource_is_io() and pci_resource_is_mem() helpers
     for resource type checks, replacing the open-coded bitwise
     flag tests in pci_mmap_resource(), pci_resource_io(), and
     Alpha's pci_mmap_resource(), as per Ilpo Järvinen's
     suggestion.
   - Split the __pci_mmap_fits() cleanup into two patches.  An
     overflow fix for zero-length BARs, which now includes a
     Fixes: tag referencing the original Alpha PCI sysfs commit,
     and the WARN macro removal is a separate cleanup as per Ilpo
     Järvinen's suggestion.
   - Added a missing Fixes: tag to the Alpha lockdown check,
     referencing the commit that added the check to the generic
     path but missed Alpha's implementation.
   - Added PCI_LEGACY_IO_SIZE and PCI_LEGACY_MEM_SIZE macros to
     replace the raw literals used for legacy address space sizes.
     These are used in both Alpha's pci_mmap_legacy_page_range()
     and the static legacy attribute definitions, as per Ilpo
     Järvinen's suggestion.
   - Replaced sysfs_update_groups() in the BAR resize path with
     sysfs_remove_groups() before the resize and sysfs_create_groups()
     after, restoring the original teardown before BAR resize
     ordering.  This was reported by Sashiko, see:
     https://sashiko.dev/#/patchset/20260410055040.39233-1-kwilczynski%40kernel.org?part=7
   - Defined pci_dev_resource_attr_groups as a NULL macro when
     HAVE_PCI_MMAP and ARCH_GENERIC_PCI_MMAP_RESOURCE are both
     absent, so the resize path compiles unconditionally without
     #ifdef guards in the function body.  This was reported by
     Sashiko, see:
     https://sashiko.dev/#/patchset/20260410055040.39233-1-kwilczynski%40kernel.org?part=7
   - Moved the pci_legacy_has_sparse() prototype into the patch
     that introduces the function, alongside the existing
     pci_adjust_legacy_attr() declaration, to fix a bisection
     issue where Alpha would warn on -Wmissing-prototypes.
     This was reported by Sashiko, see:
     https://sashiko.dev/#/patchset/20260410055040.39233-1-kwilczynski%40kernel.org?part=18

Changes in v3:
  https://lore.kernel.org/linux-pci/20210910202623.2293708-1-kw@linux.com/

  - Updated for modern kernel releases and expanded scope.  The
    v2 only covered the generic resource files.  This version
    also converts Alpha's sparse/dense resource files and the
    legacy bus attributes, removing pci_sysfs_init() entirely.
  - Split the single macro definition into three distinct ones
    (per I/O, UC, and WC), to make sure that each carries only
    the callbacks its resource type needs.
  - Updated to use the new .bin_size callback, as the attributes
    are const, to replace using a->size directly, which was not
    ideal.  This required changes to pci_llseek_resource(), to
    ensure that it would work for device and bus-level attributes.
  - Updated the __resource_resize_store() to include CAP_SYS_ADMIN
    capabilities check.
  - Added the security_locked_down() check to Alpha's
    pci_mmap_resource(), to align with other architectures.

Changes in v2:
  https://lore.kernel.org/linux-pci/20210825212255.878043-1-kw@linux.com/

  - Refactored code so that the macros, helpers and internal
    functions can be used to correctly leverage the read(),
    write() and mmap() callbacks rather than to use the
    .is_bin_visible() callback to set up sysfs objects
    internals as this is not supported.
  - Refactored some if-statements to check for a resource
    flag first, and then call either arch_can_pci_mmap_io()
    or arch_can_pci_mmap_wc(), plus store result of testing
    for IORESOURCE_MEM and IORESOURCE_PREFETCH flags into
    a boolean variable, as per Bjorn Helgaas' suggestion.
  - Renamed pci_read_resource_io() and pci_write_resource_io()
    callbacks so that these are not specifically tied to I/O
    BARs read() and write() operations also as per Bjorn
    Helgaas' suggestion.
  - Updated style for code handling bitwise operations to
    match the style that is preferred as per Bjorn Helgaas'
    suggestion.
  - Updated commit messages adding more details about the
    implementation as requested by Bjorn Helgaas.

Krzysztof Wilczyński (24):
  PCI/sysfs: Use PCI resource accessor macros
  PCI: Add pci_resource_is_io() and pci_resource_is_mem() helpers
  PCI/sysfs: Only allow supported resource types in I/O and MMIO helpers
  PCI/sysfs: Use BAR length in pci_llseek_resource() when attr->size is
    zero
  PCI/sysfs: Add CAP_SYS_ADMIN check to __resource_resize_store()
  PCI/sysfs: Add static PCI resource attribute macros
  PCI/sysfs: Convert PCI resource files to static attributes
  PCI/sysfs: Convert __resource_resize_store() to use static attributes
  PCI/sysfs: Add stubs for pci_{create,remove}_sysfs_dev_files()
  PCI/sysfs: Limit pci_sysfs_init() late_initcall compile scope
  alpha/PCI: Add security_locked_down() check to pci_mmap_resource()
  alpha/PCI: Use BAR index in sysfs attr->private instead of resource
    pointer
  alpha/PCI: Use PCI resource accessor macros
  alpha/PCI: Clean up pci_mmap_resource()
  alpha/PCI: Fix __pci_mmap_fits() overflow for zero-length BARs
  alpha/PCI: Remove WARN from __pci_mmap_fits()
  alpha/PCI: Add static PCI resource attribute macros
  alpha/PCI: Convert resource files to static attributes
  PCI/sysfs: Remove pci_{create,remove}_sysfs_dev_files()
  PCI: Add macros for legacy I/O and memory address space sizes
  alpha/PCI: Compute legacy size in pci_mmap_legacy_page_range()
  PCI/sysfs: Add __weak pci_legacy_has_sparse() helper
  PCI/sysfs: Convert legacy I/O and memory attributes to static
    definitions
  PCI/sysfs: Remove pci_create_legacy_files() and pci_sysfs_init()

 arch/alpha/include/asm/pci.h   |  13 +-
 arch/alpha/kernel/pci-sysfs.c  | 373 +++++++++++----------
 arch/powerpc/include/asm/pci.h |   2 -
 drivers/pci/bus.c              |   1 -
 drivers/pci/pci-sysfs.c        | 575 +++++++++++++++++++--------------
 drivers/pci/pci.h              |  16 +-
 drivers/pci/probe.c            |   6 -
 drivers/pci/remove.c           |   3 -
 include/linux/pci.h            |  39 ++-
 9 files changed, 578 insertions(+), 450 deletions(-)

-- 
2.53.0


^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2026-04-11 10:40 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-11  8:01 [PATCH v4 00/24] PCI: Convert all dynamic sysfs attributes to static Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 01/24] PCI/sysfs: Use PCI resource accessor macros Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 02/24] PCI: Add pci_resource_is_io() and pci_resource_is_mem() helpers Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 03/24] PCI/sysfs: Only allow supported resource types in I/O and MMIO helpers Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 04/24] PCI/sysfs: Use BAR length in pci_llseek_resource() when attr->size is zero Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 05/24] PCI/sysfs: Add CAP_SYS_ADMIN check to __resource_resize_store() Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 06/24] PCI/sysfs: Add static PCI resource attribute macros Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 07/24] PCI/sysfs: Convert PCI resource files to static attributes Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 08/24] PCI/sysfs: Convert __resource_resize_store() to use " Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 09/24] PCI/sysfs: Add stubs for pci_{create,remove}_sysfs_dev_files() Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 10/24] PCI/sysfs: Limit pci_sysfs_init() late_initcall compile scope Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 11/24] alpha/PCI: Add security_locked_down() check to pci_mmap_resource() Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 12/24] alpha/PCI: Use BAR index in sysfs attr->private instead of resource pointer Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 13/24] alpha/PCI: Use PCI resource accessor macros Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 14/24] alpha/PCI: Clean up pci_mmap_resource() Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 15/24] alpha/PCI: Fix __pci_mmap_fits() overflow for zero-length BARs Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 16/24] alpha/PCI: Remove WARN from __pci_mmap_fits() Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 17/24] alpha/PCI: Add static PCI resource attribute macros Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 18/24] alpha/PCI: Convert resource files to static attributes Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 19/24] PCI/sysfs: Remove pci_{create,remove}_sysfs_dev_files() Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 20/24] PCI: Add macros for legacy I/O and memory address space sizes Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 21/24] alpha/PCI: Compute legacy size in pci_mmap_legacy_page_range() Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 22/24] PCI/sysfs: Add __weak pci_legacy_has_sparse() helper Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 23/24] PCI/sysfs: Convert legacy I/O and memory attributes to static definitions Krzysztof Wilczyński
2026-04-11  8:01 ` [PATCH v4 24/24] PCI/sysfs: Remove pci_create_legacy_files() and pci_sysfs_init() Krzysztof Wilczyński
2026-04-11 10:40 ` [PATCH v4 00/24] PCI: Convert all dynamic sysfs attributes to static Magnus Lindholm

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox