From: Krzysztof Wilczyński
To: Bjorn Helgaas
Cc: Bjorn Helgaas, Manivannan Sadhasivam, Lorenzo Pieralisi,
	Magnus Lindholm, Matt Turner, Richard Henderson, Christophe Leroy,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin, Dexuan Cui,
	Krzysztof Hałasa, Lukas Wunner, "Oliver O'Halloran",
	Saurabh Singh Sengar, Shuan He, Srivatsa Bhat, Ilpo Järvinen,
	linux-pci@vger.kernel.org, linux-alpha@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org
Subject: [PATCH 00/20] PCI: Convert all dynamic sysfs attributes to static
Date: Fri, 10 Apr 2026 05:50:20 +0000
Message-ID: <20260410055040.39233-1-kwilczynski@kernel.org>

Hello,

This series converts every dynamically allocated PCI sysfs attribute to
a static const definition.  After the full series, pci_sysfs_init() and
sysfs_initialized are gone, and every sysfs file is created by the
driver model at device_add() time.

Currently, the PCI resource files (resourceN, resourceN_wc) and the
legacy bus files (legacy_io, legacy_mem) are created dynamically from
two unsynchronised paths:

Path A: late_initcall

  pci_sysfs_init                       (late_initcall)
    sysfs_initialized = 1
    for_each_pci_dev
      pci_create_sysfs_dev_files
        sysfs_create_bin_file          (resourceN, resourceN_wc)
    pci_find_next_bus
      pci_create_legacy_files
        sysfs_create_bin_file          (legacy_io, legacy_mem)

Path B: device registration / hotplug

  pci_bus_add_devices
    pci_bus_add_device
      pci_create_sysfs_dev_files
        if (!sysfs_initialized) return <- only guard
        sysfs_create_bin_file          (resourceN, resourceN_wc)

On most ACPI systems this does not race because PCI enumeration
completes at subsys_initcall time, before pci_sysfs_init() runs:

  subsys_initcall (level 4):
    acpi_pci_root_add
      pci_bus_add_device
        pci_create_sysfs_dev_files
          if (!sysfs_initialized)      <- not yet set
            return -EACCES

  late_initcall (level 7):
    pci_sysfs_init
      sysfs_initialized = 1
      for_each_pci_dev
        pci_create_sysfs_dev_files     <- creates the files, no race

On Devicetree platforms the host controller is a platform driver that
probes via the driver model, often on a workqueue, and overlaps with
the late_initcall:

  CPU 0 (late_initcall)              CPU 1 (driver probe)
  ---------------------------        ----------------------------
  pci_sysfs_init()
    sysfs_initialized = 1
    for_each_pci_dev(pdev)           pci_bus_add_device(pdev)
      pci_create_sysfs_dev_files()     pci_create_sysfs_dev_files()
        sysfs_create_bin_file()          sysfs_create_bin_file()
                                           -> "duplicate filename"

The same happens on ACPI when probing is asynchronous (hv_pci on Azure,
RISC-V with ACPI).

The duplicate causes sysfs_create_bin_file() to fail with -EEXIST.
pci_create_resource_files() then calls pci_remove_resource_files() in
its error unwind, tearing down files the other thread created and still
references through pdev->res_attr[].  This has caused kernel panics on
i.MX6 and boot failures on other platforms.
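
To make the error unwind concrete, here is a minimal userspace sketch in
C.  It is not kernel code: struct toy_dir, toy_create(), toy_remove()
and toy_create_resource_files() are hypothetical stand-ins for a
device's sysfs directory, sysfs_create_bin_file(),
sysfs_remove_bin_file() and pci_create_resource_files(), and the two
creation paths are run back to back rather than concurrently.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_FILES 8
#define NUM_BARS  2

/* Toy stand-in for one device's sysfs directory (hypothetical). */
struct toy_dir {
	const char *names[MAX_FILES];
	int count;
};

/* Like sysfs_create_bin_file(): refuses a name that already exists. */
static int toy_create(struct toy_dir *dir, const char *name)
{
	for (int i = 0; i < dir->count; i++)
		if (strcmp(dir->names[i], name) == 0)
			return -EEXIST;

	assert(dir->count < MAX_FILES);
	dir->names[dir->count++] = name;
	return 0;
}

/* Removal is by name, so it cannot tell which caller created the file. */
static void toy_remove(struct toy_dir *dir, const char *name)
{
	for (int i = 0; i < dir->count; i++) {
		if (strcmp(dir->names[i], name) == 0) {
			/* Fill the hole with the last entry. */
			dir->names[i] = dir->names[--dir->count];
			return;
		}
	}
}

static const char *const bar_names[NUM_BARS] = { "resource0", "resource1" };

/*
 * Simplified model of pci_create_resource_files(): on any failure the
 * unwind removes every BAR file, including ones this caller did not
 * create.
 */
static int toy_create_resource_files(struct toy_dir *dir)
{
	for (int i = 0; i < NUM_BARS; i++) {
		int ret = toy_create(dir, bar_names[i]);

		if (ret) {
			for (int j = 0; j < NUM_BARS; j++)
				toy_remove(dir, bar_names[j]);
			return ret;
		}
	}
	return 0;
}
```

Calling toy_create_resource_files() twice on the same struct toy_dir
models the two paths: the first call returns 0 and leaves both files in
place, the second returns -EEXIST and its unwind leaves the directory
empty, even though the first caller still believes its files exist.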

Several different fixes have been proposed over the years: reordering
the sysfs_initialized assignment, adding locks, checking
pci_dev_is_added(), and setting pdev->res_attr[] to NULL after kfree
(which only prevents a double-free on the teardown path, not the error
unwind removing the other thread's files).  None of them would address
the root cause.

This has been reported a few times:

  - https://lore.kernel.org/linux-pci/20250702155112.40124-1-heshuan@bytedance.com/
  - https://lore.kernel.org/linux-pci/b51519d6-ce45-4b6d-8135-c70169bd110e@h-partners.com/
  - https://lore.kernel.org/linux-pci/1702093576-30405-1-git-send-email-ssengar@linux.microsoft.com/
  - https://lore.kernel.org/linux-pci/SY0P300MB04687548090B73E40AF97D8897B82@SY0P300MB0468.AUSP300.PROD.OUTLOOK.COM/
  - https://lore.kernel.org/linux-pci/20230105174736.GA1154719@bhelgaas/
  - https://lore.kernel.org/linux-pci/m3eebg9puj.fsf@t19.piap.pl/
  - https://lore.kernel.org/linux-pci/20200716110423.xtfyb3n6tn5ixedh@pali/
  - https://lore.kernel.org/linux-pci/1366196798-15929-1-git-send-email-artem.savkov@gmail.com/
  - https://bugzilla.kernel.org/show_bug.cgi?id=215515
  - https://bugzilla.kernel.org/show_bug.cgi?id=216888

With static attributes the driver model creates sysfs entries once per
device at device_add() time, under the device lock, eliminating the
late_initcall iteration and the race along with it.

	Krzysztof

---
Changes in v3:
  https://lore.kernel.org/linux-pci/20210910202623.2293708-1-kw@linux.com/

  - Updated for modern kernel releases and expanded scope.  The v2 only
    covered the generic resource files.  This version also converts
    Alpha's sparse/dense resource files and the legacy bus attributes,
    removing pci_sysfs_init() entirely.
  - Split the single macro definition into three distinct ones (for
    I/O, UC, and WC), so that each carries only the callbacks its
    resource type needs.
  - Updated to use the new .bin_size callback, as the attributes are
    const, instead of using a->size directly, which was not ideal.
    This required changes to pci_llseek_resource() to ensure that it
    works for both device-level and bus-level attributes.
  - Updated __resource_resize_store() to include a CAP_SYS_ADMIN
    capability check.
  - Added the security_locked_down() check to Alpha's
    pci_mmap_resource(), to align with other architectures.

Changes in v2:
  https://lore.kernel.org/linux-pci/20210825212255.878043-1-kw@linux.com/

  - Refactored code so that the macros, helpers and internal functions
    can be used to correctly leverage the read(), write() and mmap()
    callbacks rather than using the .is_bin_visible() callback to set
    up sysfs object internals, as this is not supported.
  - Refactored some if-statements to check for a resource flag first,
    and then call either arch_can_pci_mmap_io() or
    arch_can_pci_mmap_wc(), plus store the result of testing for the
    IORESOURCE_MEM and IORESOURCE_PREFETCH flags in a boolean variable,
    as per Bjorn Helgaas' suggestion.
  - Renamed the pci_read_resource_io() and pci_write_resource_io()
    callbacks so that these are not specifically tied to I/O BAR read()
    and write() operations, also as per Bjorn Helgaas' suggestion.
  - Updated the style of code handling bitwise operations to match the
    preferred style, as per Bjorn Helgaas' suggestion.
  - Updated commit messages, adding more details about the
    implementation, as requested by Bjorn Helgaas.

Krzysztof Wilczyński (20):
  PCI/sysfs: Use PCI resource accessor macros
  PCI/sysfs: Only allow supported resource types in I/O and MMIO helpers
  PCI/sysfs: Use BAR length in pci_llseek_resource() when attr->size is zero
  PCI/sysfs: Add CAP_SYS_ADMIN check to __resource_resize_store()
  PCI/sysfs: Add static PCI resource attribute macros
  PCI/sysfs: Convert PCI resource files to static attributes
  PCI/sysfs: Convert __resource_resize_store() to use static attributes
  PCI/sysfs: Add stubs for pci_{create,remove}_sysfs_dev_files()
  PCI/sysfs: Limit pci_sysfs_init() late_initcall compile scope
  alpha/PCI: Add security_locked_down() check to pci_mmap_resource()
  alpha/PCI: Use BAR index in sysfs attr->private instead of resource pointer
  alpha/PCI: Use PCI resource accessor macros
  alpha/PCI: Clean up __pci_mmap_fits()
  alpha/PCI: Add static PCI resource attribute macros
  alpha/PCI: Convert resource files to static attributes
  PCI/sysfs: Remove pci_{create,remove}_sysfs_dev_files()
  alpha/PCI: Compute legacy size in pci_mmap_legacy_page_range()
  PCI/sysfs: Add __weak pci_legacy_has_sparse() helper
  PCI/sysfs: Convert legacy I/O and memory attributes to static definitions
  PCI/sysfs: Remove pci_create_legacy_files() and pci_sysfs_init()

 arch/alpha/include/asm/pci.h   |  13 +-
 arch/alpha/kernel/pci-sysfs.c  | 369 +++++++++++----------
 arch/powerpc/include/asm/pci.h |   2 -
 drivers/pci/bus.c              |   1 -
 drivers/pci/pci-sysfs.c        | 575 +++++++++++++++++++--------------
 drivers/pci/pci.h              |  16 +-
 drivers/pci/probe.c            |   6 -
 drivers/pci/remove.c           |   3 -
 include/linux/pci.h            |   9 -
 9 files changed, 545 insertions(+), 449 deletions(-)

-- 
2.53.0