linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [GIT PULL v2] PCI changes for v6.17
@ 2025-08-01 14:22 Bjorn Helgaas
  2025-08-01 21:37 ` pr-tracker-bot
  0 siblings, 1 reply; 20+ messages in thread
From: Bjorn Helgaas @ 2025-08-01 14:22 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: linux-pci, linux-kernel, Rob Herring, Lorenzo Pieralisi,
	Manivannan Sadhasivam, Krzysztof Wilczyński

The following changes since commit 66db1d3cbdb0e89f8a3b5d06a8defb25d1c3f836:

  PCI/pwrctrl: Add optional slot clock for PCI slots (2025-06-13 16:59:52 -0500)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git tags/pci-v6.17-changes

for you to fetch changes up to 58d2b6b6b214d8b4914cd4c821a8bd0c75436c2c:

  Merge branch 'pci/misc' (2025-07-31 16:12:19 -0500)

This is based on v6.16-rc1, but you have 66db1d3cbdb0 ("PCI/pwrctrl: Add
optional slot clock for PCI slots") already via 0dc89a25468e ("Merge tag
'renesas-dts-for-v6.17-tag1' of https://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel into soc/dt")

You should see a minor conflict in drivers/pci/pci.c between upstream
commit:

  907a7a2e5bf4 ("PCI/PM: Set up runtime PM even for devices without PCI PM")

and this commit from the PCI tree:

  5c0d0ee36f16 ("PCI: Support Immediate Readiness on devices without PM capabilities")

My resolution (same as linux-next) is here:

  https://git.kernel.org/cgit/linux/kernel/git/pci/pci.git/tree/drivers/pci/pci.c?id=7b76d8fd6a64#n3213

Relative to the first pull request, this drops the pci/capability-search
branch, which caused a regression on s390:
https://lore.kernel.org/r/4e10bea3aa91ee721bb40e9388e8f72f930908fe.camel@linux.ibm.com

----------------------------------------------------------------

Enumeration:

  - Allow built-in drivers, not just modular drivers, to use async initial
    probing (Lukas Wunner)

  - Support Immediate Readiness even on devices with no PM Capability (Sean
    Christopherson)

  - Consolidate definition of PCIE_RESET_CONFIG_WAIT_MS (100ms), the
    required delay between a reset and sending config requests to a device
    (Niklas Cassel)

  - Add pci_is_display() to check for "Display" base class and use it in
    ALSA hda, vfio, vga_switcheroo, vt-d (Mario Limonciello)

  - Allow 'isolated PCI functions' (multi-function devices without a
    function 0) for LoongArch, similar to s390 and jailhouse (Huacai Chen)

Power control:

  - Add ability to enable optional slot clock for cases where the PCIe host
    controller and the slot are supplied by different clocks (Marek Vasut)

PCIe native device hotplug:

  - Fix runtime PM ref imbalance on Hot-Plug Capable ports caused by
    misinterpreting a config read failure after a device has been removed
    (Lukas Wunner)

  - Avoid creating a useless PCIe port service device for pciehp if the
    slot is handled by the ACPI hotplug driver (Lukas Wunner)

  - Ignore ACPI hotplug slots when calculating depth of pciehp hotplug
    ports (Lukas Wunner)

Virtualization:

  - Save VF resizable BAR state and restore it after reset (Michał
    Winiarski)

  - Allow IOV resources (VF BARs) to be resized (Michał Winiarski)

  - Add pci_iov_vf_bar_set_size() so drivers can control VF BAR size
    (Michał Winiarski)

Endpoint framework:

  - Add RC-to-EP doorbell support using platform MSI controller, including
    a test case (Frank Li)

  - Allow BAR assignment via configfs so platforms have flexibility in
    determining BAR usage (Jerome Brunet)

Native PCIe controller drivers:

  - Convert amazon,al-alpine-v[23]-pcie, apm,xgene-pcie, axis,artpec6-pcie,
    marvell,armada-3700-pcie, st,spear1340-pcie to DT schema format (Rob
    Herring)

  - Use dev_fwnode() instead of of_fwnode_handle() to remove OF dependency
    in altera (fixes an unused variable), designware-host, mediatek,
    mediatek-gen3, mobiveil, plda, xilinx, xilinx-dma, xilinx-nwl (Jiri
    Slaby, Arnd Bergmann)

  - Convert aardvark, altera, brcmstb, designware-host, iproc, mediatek,
    mediatek-gen3, mobiveil, plda, rcar-host, vmd, xilinx, xilinx-dma,
    xilinx-nwl from using pci_msi_create_irq_domain() to using
    msi_create_parent_irq_domain() instead; this makes the interrupt
    controller per-PCI device, allows dynamic allocation of vectors after
    initialization, and allows support of IMS (Nam Cao)

APM X-Gene PCIe controller driver:

  - Rewrite MSI handling to MSI CPU affinity, drop useless CPU hotplug
    bits, use device-managed memory allocations, and clean things up (Marc
    Zyngier)

  - Probe xgene-msi as a standard platform driver rather than a
    subsys_initcall (Marc Zyngier)

Broadcom STB PCIe controller driver:

  - Add optional DT 'num-lanes' property and if present, use it to override
    the Maximum Link Width advertised in Link Capabilities (Jim Quinlan)

Cadence PCIe controller driver:

  - Use PCIe Message routing types from the PCI core rather than defining
    private ones (Hans Zhang)

Freescale i.MX6 PCIe controller driver:

  - Add IMX8MQ_EP third 64-bit BAR in epc_features (Richard Zhu)

  - Add IMX8MM_EP and IMX8MP_EP fixed 256-byte BAR 4 in epc_features
    (Richard Zhu)

  - Configure LUT for MSI/IOMMU in Endpoint mode so Root Complex can
    trigger doorbel on Endpoint (Frank Li)

  - Remove apps_reset (LTSSM_EN) from
    imx_pcie_{assert,deassert}_core_reset(), which fixes a hotplug
    regression on i.MX8MM (Richard Zhu)

  - Delay Endpoint link start until configfs 'start' written (Richard Zhu)

Intel VMD host bridge driver:

  - Add Intel Panther Lake (PTL)-H/P/U Vendor ID (George D Sworo)

Qualcomm PCIe controller driver:

  - Add DT binding and driver support for SA8255p, which supports ECAM for
    Configuration Space access (Mayank Rana)

  - Update DT binding and driver to describe PHYs and per-Root Port resets
    in a Root Port stanza and deprecate describing them in the host bridge;
    this makes it possible to support multiple Root Ports in the future
    (Krishna Chaitanya Chundru)

  - Add Qualcomm QCS615 to SM8150 DT binding (Ziyue Zhang)

  - Add Qualcomm QCS8300 to SA8775p DT binding (Ziyue Zhang)

  - Drop TBU and ref clocks from Qualcomm SM8150 and SC8180x DT bindings
    (Konrad Dybcio)

  - Document 'link_down' reset in Qualcomm SA8775P DT binding (Ziyue Zhang)

  - Add required PCIE_RESET_CONFIG_WAIT_MS delay after Link up IRQ (Niklas
    Cassel)

Rockchip PCIe controller driver:

  - Drop unused PCIe Message routing and code definitions (Hans Zhang)

  - Remove several unused header includes (Hans Zhang)

  - Use standard PCIe config register definitions instead of
    rockchip-specific redefinitions (Geraldo Nascimento)

  - Set Target Link Speed to 5.0 GT/s before retraining so we have a chance
    to train at a higher speed (Geraldo Nascimento)

Rockchip DesignWare PCIe controller driver:

  - Prevent race between link training and register update via DBI by
    inhibiting link training after hot reset and link down (Wilfred
    Mallawa)

  - Add required PCIE_RESET_CONFIG_WAIT_MS delay after Link up IRQ (Niklas
    Cassel)

Sophgo PCIe controller driver:

  - Add DT binding and driver for Sophgo SG2044 PCIe controller driver in
    Root Complex mode (Inochi Amaoto)

Synopsys DesignWare PCIe controller driver:

  - Add required PCIE_RESET_CONFIG_WAIT_MS after waiting for Link up on
    Ports that support > 5.0 GT/s.  Slower Ports still rely on the
    not-quite-correct PCIE_LINK_WAIT_SLEEP_MS 90ms default delay while
    waiting for the Link (Niklas Cassel)

----------------------------------------------------------------
Akshay Jindal (1):
      PCI/AER: Add message when AER_MAX_MULTI_ERR_DEVICES limit is hit

Bartosz Golaszewski (1):
      PCI/pwrctrl: Fix the kerneldoc tag for private fields

Bjorn Helgaas (28):
      PCI: Fix typos
      Merge branch 'pci/aer'
      Merge branch 'pci/aspm'
      Merge branch 'pci/boot-display'
      Merge branch 'pci/enumeration'
      Merge branch 'pci/hotplug'
      Merge branch 'pci/iommu'
      Merge branch 'pci/pwrctrl'
      Merge branch 'pci/resources'
      Merge branch 'pci/dt-bindings'
      Merge branch 'pci/endpoint/core'
      Merge branch 'pci/endpoint/doorbell'
      Merge branch 'pci/endpoint/epf-vntb'
      Merge branch 'pci/controller/msi-parent'
      Merge branch 'pci/controller/linkup-fix'
      Merge branch 'pci/controller/brcmstb'
      Merge branch 'pci/controller/cadence'
      Merge branch 'pci/controller/dwc'
      Merge branch 'pci/controller/dw-rockchip'
      Merge branch 'pci/controller/imx6'
      Merge branch 'pci/controller/mvebu'
      Merge branch 'pci/controller/qcom'
      Merge branch 'pci/controller/rockchip'
      Merge branch 'pci/controller/rockchip-host'
      Merge branch 'pci/controller/sophgo'
      Merge branch 'pci/controller/vmd'
      Merge branch 'pci/controller/xgene'
      Merge branch 'pci/misc'

Damien Le Moal (2):
      PCI: endpoint: Fix configfs group list head handling
      PCI: endpoint: Fix configfs group removal on driver teardown

Florian Fainelli (2):
      MAINTAINERS: Drop Nicolas from maintaining pcie-brcmstb
      PCI: brcmstb: Replace open coded value with PCIE_T_RRS_READY_MS

Frank Li (8):
      PCI: imx6: Add helper function imx_pcie_add_lut_by_rid()
      PCI: imx6: Add LUT configuration for MSI/IOMMU in Endpoint mode
      PCI: endpoint: Add RC-to-EP doorbell support using platform MSI controller
      PCI: endpoint: pci-ep-msi: Add checks for MSI parent and mutability
      PCI: endpoint: Add pci_epf_align_inbound_addr() helper for inbound address alignment
      PCI: endpoint: pci-epf-test: Add doorbell test support
      misc: pci_endpoint_test: Add doorbell test case
      selftests: pci_endpoint: Add doorbell test case

George D Sworo (1):
      PCI: vmd: Add VMD Device ID Support for Panther Lake (PTL)-H/P/U

Geraldo Nascimento (2):
      PCI: rockchip: Use standard PCIe definitions
      PCI: rockchip: Set Target Link Speed to 5.0 GT/s before retraining

Guilherme Giacomo Simoes (1):
      PCI: hotplug: Remove TODO about unused .get_power(), .hardware_test()

Hans Zhang (10):
      PCI: rockchip-host: Fix "Unexpected Completion" log message
      PCI: rockchip-host: Correct non-fatal error log message
      PCI: rockchip-host: Remove unused header includes
      dt-bindings: PCI: pci-ep: Extend max-link-speed to PCIe Gen5/Gen6
      PCI/ASPM: Use boolean type for aspm_disabled and aspm_force
      PCI/ASPM: Consolidate variable declaration and initialization
      PCI/AER: Use bool for AER disable state tracking
      PCI: cadence: Replace private message routing enums with PCI core definitions
      PCI: rockchip: Remove redundant PCIe message routing definitions
      PCI: dwc: Simplify the return value of PTM debugfs functions returning bool

Huacai Chen (1):
      PCI: Extend isolated function probing to LoongArch

Inochi Amaoto (2):
      dt-bindings: pci: Add Sophgo SG2044 PCIe host
      PCI: dwc: Add Sophgo SG2044 PCIe controller driver in Root Complex mode

Jerome Brunet (3):
      PCI: endpoint: pci-epf-vntb: Return -ENOENT if pci_epc_get_next_free_bar() fails
      PCI: endpoint: pci-epf-vntb: Align MW naming with config names
      PCI: endpoint: pci-epf-vntb: Allow BAR assignment via configfs

Jim Quinlan (2):
      dt-bindings: PCI: brcm,stb-pcie: Add num-lanes property
      PCI: brcmstb: Set MLW based on "num-lanes" DT property if present

Jiri Slaby (SUSE) (1):
      PCI: controller: Use dev_fwnode() instead of of_fwnode_handle()

Jiwei Sun (2):
      PCI: Fix link speed calculation on retrain failure
      PCI: Adjust the position of reading the Link Control 2 register

Konrad Dybcio (2):
      dt-bindings: PCI: qcom,pcie-sc8180x: Drop unrelated clocks from PCIe hosts
      dt-bindings: PCI: qcom,pcie-sm8150: Drop unrelated clocks from PCIe hosts

Krishna Chaitanya Chundru (2):
      dt-bindings: PCI: qcom: Move PHY & reset GPIO to Root Port node
      PCI: qcom: Add support for parsing the new Root Port binding

Lukas Wunner (5):
      PCI: Allow built-in drivers to use async initial probing
      PCI/ACPI: Fix runtime PM ref imbalance on Hot-Plug Capable ports
      PCI/portdrv: Use is_pciehp instead of is_hotplug_bridge
      PCI: pciehp: Use is_pciehp instead of is_hotplug_bridge
      PCI: Move is_pciehp check out of pciehp_is_native()

Manivannan Sadhasivam (2):
      PCI: dwc: Make dw_pcie_ptm_ops static
      PCI: endpoint: pci-epf-vntb: Fix the incorrect usage of __iomem attribute

Marc Zyngier (13):
      genirq: Teach handle_simple_irq() to resend an in-progress interrupt
      PCI: xgene: Defer probing if the MSI widget driver hasn't probed yet
      PCI: xgene: Drop useless conditional compilation
      PCI: xgene: Drop XGENE_PCIE_IP_VER_UNKN
      PCI: xgene-msi: Make per-CPU interrupt setup robust
      PCI: xgene-msi: Drop superfluous fields from xgene_msi structure
      PCI: xgene-msi: Use device-managed memory allocations
      PCI: xgene-msi: Get rid of intermediate tracking structure
      PCI: xgene-msi: Sanitise MSI allocation and affinity setting
      PCI: xgene-msi: Resend an MSI racing with itself on a different CPU
      PCI: xgene-msi: Probe as a standard platform driver
      PCI: xgene-msi: Restructure handler setup/teardown
      cpu/hotplug: Remove unused cpuhp_state CPUHP_PCI_XGENE_DEAD

Mario Limonciello (5):
      PCI: Add pci_is_display() to check if device is a display controller
      vfio/pci: Use pci_is_display()
      vga_switcheroo: Use pci_is_display()
      iommu/vt-d: Use pci_is_display()
      ALSA: hda: Use pci_is_display()

Mayank Rana (4):
      PCI: dwc: Export DWC MSI controller related APIs
      PCI: host-generic: Rename and export gen_pci_init() for PCIe controller drivers
      dt-bindings: PCI: qcom,pcie-sa8255p: Document ECAM compliant PCIe root complex
      PCI: qcom: Add support for Qualcomm SA8255p based PCIe Root Complex

Michał Winiarski (5):
      PCI/IOV: Restore VF resizable BAR state after reset
      PCI/IOV: Add pci_resource_num_to_vf_bar() to convert VF BAR number to/from IOV resource
      PCI/IOV: Allow IOV resources to be resized in pci_resize_resource()
      PCI/IOV: Check that VF BAR fits within the reservation
      PCI/IOV: Allow drivers to control VF BAR size

Nam Cao (15):
      PCI: dwc: Switch to msi_create_parent_irq_domain()
      PCI: mobiveil: Switch to msi_create_parent_irq_domain()
      PCI: aardvark: Switch to msi_create_parent_irq_domain()
      PCI: altera-msi: Switch to msi_create_parent_irq_domain()
      PCI: brcmstb: Switch to msi_create_parent_irq_domain()
      PCI: iproc: Switch to msi_create_parent_irq_domain()
      PCI: mediatek-gen3: Switch to msi_create_parent_irq_domain()
      PCI: mediatek: Switch to msi_create_parent_irq_domain()
      PCI: rcar-host: Switch to msi_create_parent_irq_domain()
      PCI: xilinx-xdma: Switch to msi_create_parent_irq_domain()
      PCI: xilinx-nwl: Switch to msi_create_parent_irq_domain()
      PCI: xilinx: Switch to msi_create_parent_irq_domain()
      PCI: plda: Switch to msi_create_parent_irq_domain()
      PCI: vmd: Convert to lock guards
      PCI: vmd: Switch to msi_create_parent_irq_domain()

Niklas Cassel (6):
      PCI: Rename PCIE_RESET_CONFIG_DEVICE_WAIT_MS to PCIE_RESET_CONFIG_WAIT_MS
      PCI: rockchip-host: Use macro PCIE_RESET_CONFIG_WAIT_MS
      PCI: dw-rockchip: Wait PCIE_RESET_CONFIG_WAIT_MS after link-up IRQ
      PCI: qcom: Wait PCIE_RESET_CONFIG_WAIT_MS after link-up IRQ
      PCI: dwc: Ensure that dw_pcie_wait_for_link() waits 100 ms after link up
      PCI: Move link up wait time and max retries macros to pci.h

Richard Zhu (4):
      PCI: imx6: Add IMX8MQ_EP third 64-bit BAR in epc_features
      PCI: imx6: Add IMX8MM_EP and IMX8MP_EP fixed 256-byte BAR 4 in epc_features
      PCI: imx6: Remove apps_reset toggling from imx_pcie_{assert/deassert}_core_reset
      PCI: imx6: Delay link start until configfs 'start' written

Rob Herring (Arm) (6):
      dt-bindings: PCI: Convert st,spear1340-pcie to DT schema
      dt-bindings: PCI: Convert axis,artpec6-pcie to DT schema
      dt-bindings: PCI: Convert apm,xgene-pcie to DT schema
      dt-bindings: PCI: Convert marvell,armada-3700-pcie to DT schema
      dt-bindings: PCI: Convert amazon,al-alpine-v[23]-pcie to DT schema
      dt-bindings: PCI: Remove 83xx-512x-pci.txt

Robin Murphy (1):
      PCI: Fix driver_managed_dma check

Salah Triki (1):
      PCI: mvebu: Use devm_add_action_or_reset() instead of devm_add_action()

Sean Christopherson (1):
      PCI: Support Immediate Readiness on devices without PM capabilities

Wilfred Mallawa (1):
      PCI: dw-rockchip: Delay link training after hot reset in EP mode

Ziyue Zhang (3):
      dt-bindings: PCI: qcom,pcie-sm8150: Document QCS615
      dt-bindings: PCI: qcom,pcie-sa8775p: Document QCS8300
      dt-bindings: PCI: qcom,pcie-sa8775p: Document 'link_down' reset

 Documentation/PCI/endpoint/pci-test-howto.rst      |  15 +
 .../devicetree/bindings/pci/83xx-512x-pci.txt      |  39 --
 .../devicetree/bindings/pci/aardvark-pci.txt       |  59 ---
 .../bindings/pci/amazon,al-alpine-v3-pcie.yaml     |  71 ++++
 .../devicetree/bindings/pci/apm,xgene-pcie.yaml    |  84 ++++
 .../devicetree/bindings/pci/axis,artpec6-pcie.txt  |  50 ---
 .../devicetree/bindings/pci/axis,artpec6-pcie.yaml | 118 ++++++
 .../devicetree/bindings/pci/brcm,stb-pcie.yaml     |   4 +
 .../bindings/pci/marvell,armada-3700-pcie.yaml     |  99 +++++
 Documentation/devicetree/bindings/pci/pci-ep.yaml  |   2 +-
 Documentation/devicetree/bindings/pci/pcie-al.txt  |  46 ---
 .../devicetree/bindings/pci/qcom,pcie-common.yaml  |  32 +-
 .../devicetree/bindings/pci/qcom,pcie-sa8255p.yaml | 122 ++++++
 .../devicetree/bindings/pci/qcom,pcie-sa8775p.yaml |  18 +-
 .../devicetree/bindings/pci/qcom,pcie-sc7280.yaml  |  16 +-
 .../devicetree/bindings/pci/qcom,pcie-sc8180x.yaml |  14 +-
 .../devicetree/bindings/pci/qcom,pcie-sm8150.yaml  |  21 +-
 .../devicetree/bindings/pci/snps,dw-pcie.yaml      |   2 +-
 .../bindings/pci/sophgo,sg2044-pcie.yaml           | 122 ++++++
 .../devicetree/bindings/pci/spear13xx-pcie.txt     |  14 -
 .../devicetree/bindings/pci/st,spear1340-pcie.yaml |  45 +++
 .../devicetree/bindings/pci/xgene-pci.txt          |  50 ---
 MAINTAINERS                                        |   7 +-
 drivers/gpu/vga/vga_switcheroo.c                   |   2 +-
 drivers/iommu/intel/iommu.c                        |   2 +-
 drivers/misc/pci_endpoint_test.c                   |  83 ++++
 drivers/pci/bus.c                                  |   5 +-
 drivers/pci/controller/Kconfig                     |  11 +
 drivers/pci/controller/cadence/pcie-cadence-ep.c   |   2 +-
 drivers/pci/controller/cadence/pcie-cadence.h      |  20 -
 drivers/pci/controller/dwc/Kconfig                 |  12 +
 drivers/pci/controller/dwc/Makefile                |   1 +
 drivers/pci/controller/dwc/pci-imx6.c              |  40 +-
 .../pci/controller/dwc/pcie-designware-debugfs.c   |  16 +-
 drivers/pci/controller/dwc/pcie-designware-host.c  | 103 ++---
 drivers/pci/controller/dwc/pcie-designware.c       |  14 +-
 drivers/pci/controller/dwc/pcie-designware.h       |  19 +-
 drivers/pci/controller/dwc/pcie-dw-rockchip.c      |  16 +-
 drivers/pci/controller/dwc/pcie-qcom.c             | 327 ++++++++++++++--
 drivers/pci/controller/dwc/pcie-sophgo.c           | 257 +++++++++++++
 drivers/pci/controller/mobiveil/Kconfig            |   1 +
 .../pci/controller/mobiveil/pcie-mobiveil-host.c   |  48 +--
 drivers/pci/controller/mobiveil/pcie-mobiveil.h    |   1 -
 drivers/pci/controller/pci-aardvark.c              |  57 ++-
 drivers/pci/controller/pci-host-common.c           |   5 +-
 drivers/pci/controller/pci-host-common.h           |   2 +
 drivers/pci/controller/pci-mvebu.c                 |   6 +-
 drivers/pci/controller/pci-xgene-msi.c             | 426 ++++++++-------------
 drivers/pci/controller/pci-xgene.c                 |  33 +-
 drivers/pci/controller/pcie-altera-msi.c           |  43 +--
 drivers/pci/controller/pcie-altera.c               |   3 +-
 drivers/pci/controller/pcie-brcmstb.c              |  80 ++--
 drivers/pci/controller/pcie-iproc-msi.c            |  44 +--
 drivers/pci/controller/pcie-mediatek-gen3.c        |  64 ++--
 drivers/pci/controller/pcie-mediatek.c             |  48 ++-
 drivers/pci/controller/pcie-rcar-host.c            |  68 ++--
 drivers/pci/controller/pcie-rockchip-ep.c          |   4 +-
 drivers/pci/controller/pcie-rockchip-host.c        |  64 ++--
 drivers/pci/controller/pcie-rockchip.h             |  26 +-
 drivers/pci/controller/pcie-xilinx-dma-pl.c        |  47 +--
 drivers/pci/controller/pcie-xilinx-nwl.c           |  44 +--
 drivers/pci/controller/pcie-xilinx.c               |  54 +--
 drivers/pci/controller/plda/Kconfig                |   1 +
 drivers/pci/controller/plda/pcie-plda-host.c       |  43 +--
 drivers/pci/controller/plda/pcie-plda.h            |   1 -
 drivers/pci/controller/plda/pcie-starfive.c        |   2 +-
 drivers/pci/controller/vmd.c                       | 249 ++++++------
 drivers/pci/endpoint/Kconfig                       |   8 +
 drivers/pci/endpoint/Makefile                      |   1 +
 drivers/pci/endpoint/functions/pci-epf-test.c      | 130 +++++++
 drivers/pci/endpoint/functions/pci-epf-vntb.c      | 144 ++++++-
 drivers/pci/endpoint/pci-ep-cfs.c                  |   1 +
 drivers/pci/endpoint/pci-ep-msi.c                  | 100 +++++
 drivers/pci/endpoint/pci-epf-core.c                |  40 +-
 drivers/pci/hotplug/TODO                           |   4 -
 drivers/pci/hotplug/pciehp_hpc.c                   |   2 +-
 drivers/pci/iov.c                                  | 153 +++++++-
 drivers/pci/msi/msi.c                              |   2 +-
 drivers/pci/pci-acpi.c                             |   7 +-
 drivers/pci/pci-driver.c                           |   6 +-
 drivers/pci/pci.c                                  |  30 +-
 drivers/pci/pci.h                                  |  84 +++-
 drivers/pci/pcie/aer.c                             |   7 +-
 drivers/pci/pcie/aspm.c                            |  11 +-
 drivers/pci/pcie/portdrv.c                         |   2 +-
 drivers/pci/pcie/ptm.c                             |   2 +-
 drivers/pci/probe.c                                |  12 +-
 drivers/pci/quirks.c                               |   6 +-
 drivers/pci/setup-bus.c                            |   3 +-
 drivers/pci/setup-res.c                            |  35 +-
 drivers/vfio/pci/vfio_pci_igd.c                    |   3 +-
 include/linux/cpuhotplug.h                         |   1 -
 include/linux/hypervisor.h                         |   3 +
 include/linux/pci-ep-msi.h                         |  28 ++
 include/linux/pci-epf.h                            |  18 +
 include/linux/pci-pwrctrl.h                        |   2 +-
 include/linux/pci.h                                |  27 ++
 include/linux/pci_hotplug.h                        |   3 +-
 include/uapi/linux/pci_regs.h                      |   9 +
 include/uapi/linux/pcitest.h                       |   1 +
 kernel/irq/chip.c                                  |   8 +-
 sound/hda/hdac_i915.c                              |   2 +-
 sound/pci/hda/hda_intel.c                          |   4 +-
 .../selftests/pci_endpoint/pci_endpoint_test.c     |  28 ++
 104 files changed, 2996 insertions(+), 1375 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/pci/83xx-512x-pci.txt
 delete mode 100644 Documentation/devicetree/bindings/pci/aardvark-pci.txt
 create mode 100644 Documentation/devicetree/bindings/pci/amazon,al-alpine-v3-pcie.yaml
 create mode 100644 Documentation/devicetree/bindings/pci/apm,xgene-pcie.yaml
 delete mode 100644 Documentation/devicetree/bindings/pci/axis,artpec6-pcie.txt
 create mode 100644 Documentation/devicetree/bindings/pci/axis,artpec6-pcie.yaml
 create mode 100644 Documentation/devicetree/bindings/pci/marvell,armada-3700-pcie.yaml
 delete mode 100644 Documentation/devicetree/bindings/pci/pcie-al.txt
 create mode 100644 Documentation/devicetree/bindings/pci/qcom,pcie-sa8255p.yaml
 create mode 100644 Documentation/devicetree/bindings/pci/sophgo,sg2044-pcie.yaml
 delete mode 100644 Documentation/devicetree/bindings/pci/spear13xx-pcie.txt
 create mode 100644 Documentation/devicetree/bindings/pci/st,spear1340-pcie.yaml
 delete mode 100644 Documentation/devicetree/bindings/pci/xgene-pci.txt
 create mode 100644 drivers/pci/controller/dwc/pcie-sophgo.c
 create mode 100644 drivers/pci/endpoint/pci-ep-msi.c
 create mode 100644 include/linux/pci-ep-msi.h

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL v2] PCI changes for v6.17
  2025-08-01 14:22 [GIT PULL v2] PCI changes for v6.17 Bjorn Helgaas
@ 2025-08-01 21:37 ` pr-tracker-bot
  2025-08-07  3:34   ` Ammar Faizi
  0 siblings, 1 reply; 20+ messages in thread
From: pr-tracker-bot @ 2025-08-01 21:37 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Linus Torvalds, linux-pci, linux-kernel, Rob Herring,
	Lorenzo Pieralisi, Manivannan Sadhasivam,
	Krzysztof Wilczyński

The pull request you sent on Fri, 1 Aug 2025 09:22:54 -0500:

> git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git tags/pci-v6.17-changes

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/0bd0a41a5120f78685a132834865b0a631b9026a

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL v2] PCI changes for v6.17
  2025-08-01 21:37 ` pr-tracker-bot
@ 2025-08-07  3:34   ` Ammar Faizi
  2025-08-07  3:51     ` Lukas Wunner
  0 siblings, 1 reply; 20+ messages in thread
From: Ammar Faizi @ 2025-08-07  3:34 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Linus Torvalds, Linux PCI Mailing List, Linux Kernel Mailing List,
	Rob Herring, Lorenzo Pieralisi, Manivannan Sadhasivam,
	Krzysztof Wilczyński, Armando Budianto,
	Alviro Iskandar Setiawan, gwml

On Fri, 01 Aug 2025 21:37:28 +0000, pr-tracker-bot@kernel.org, wrote:
>
> The pull request you sent on Fri, 1 Aug 2025 09:22:54 -0500:
>
> > git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git tags/pci-v6.17-changes
>
> has been merged into torvalds/linux.git:
> https://git.kernel.org/torvalds/c/0bd0a41a5120f78685a132834865b0a631b9026a

Yesterday, I synced with Linus' tree, but couldn't boot. Crashed with
this call trace:

  https://gist.githubusercontent.com/ammarfaizi2/3ba41f13517be4bae70cde869347d259/raw/0ac09b3e1d90d51c3fed14ca9f837f45d7730f0a/crash.jpg

This morning, I synced with Linus' tree again, still the same result.

I suspect it's related to pci. I'm still bisecting. I've successfully
narrowed it down to this pci pull.

  0bd0a41a5120 (refs/bisect/bad) Merge tag 'pci-v6.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
  db5f0c3e3e60 ring-buffer: Convert ring_buffer_write() to use guard(preempt_notrace)
  12d518961586 tracing: Use __free(kfree) in trace.c to remove gotos
  debe57fbe12c tracing: Add guard() around locks and mutexes in trace.c
  788fa4b47cdc tracing: Add guard(ring_buffer_nest)
  c89504a703fb tracing: Remove unneeded goto out logic
  877d94c74e4c (refs/bisect/good-877d94c74e4c6665d2af55c0154363b43b947e60) Merge tag 'linux-watchdog-6.17-rc1' of git://www.linux-watchdog.org/linux-watchd

Now, I am testing:

  769ce531faa6 (HEAD) Merge branch 'pci/controller/msi-parent'

I'll be back to it later. git bisect says:

  Bisecting: 65 revisions left to test after this (roughly 6 steps)

-- 
Ammar Faizi

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL v2] PCI changes for v6.17
  2025-08-07  3:34   ` Ammar Faizi
@ 2025-08-07  3:51     ` Lukas Wunner
  2025-08-07  4:44       ` Nam Cao
  2025-08-07  4:54       ` Ammar Faizi
  0 siblings, 2 replies; 20+ messages in thread
From: Lukas Wunner @ 2025-08-07  3:51 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: Bjorn Helgaas, Linus Torvalds, Linux PCI Mailing List,
	Linux Kernel Mailing List, Rob Herring, Lorenzo Pieralisi,
	Manivannan Sadhasivam, Krzysztof Wilczynski, Armando Budianto,
	Alviro Iskandar Setiawan, gwml, Nam Cao

On Thu, Aug 07, 2025 at 10:34:08AM +0700, Ammar Faizi wrote:
> On Fri, 01 Aug 2025 21:37:28 +0000, pr-tracker-bot@kernel.org, wrote:
> >
> > The pull request you sent on Fri, 1 Aug 2025 09:22:54 -0500:
> >
> > > git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git tags/pci-v6.17-changes
> >
> > has been merged into torvalds/linux.git:
> > https://git.kernel.org/torvalds/c/0bd0a41a5120f78685a132834865b0a631b9026a
> 
> Yesterday, I synced with Linus' tree, but couldn't boot. Crashed with
> this call trace:
> 
>   https://gist.githubusercontent.com/ammarfaizi2/3ba41f13517be4bae70cde869347d259/raw/0ac09b3e1d90d51c3fed14ca9f837f45d7730f0a/crash.jpg
> 
> This morning, I synced with Linus' tree again, still the same result.
> 
> I suspect it's related to pci. I'm still bisecting. I've successfully
> narrowed it down to this pci pull.
> 
>   0bd0a41a5120 (refs/bisect/bad) Merge tag 'pci-v6.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
>   db5f0c3e3e60 ring-buffer: Convert ring_buffer_write() to use guard(preempt_notrace)
>   12d518961586 tracing: Use __free(kfree) in trace.c to remove gotos
>   debe57fbe12c tracing: Add guard() around locks and mutexes in trace.c
>   788fa4b47cdc tracing: Add guard(ring_buffer_nest)
>   c89504a703fb tracing: Remove unneeded goto out logic
>   877d94c74e4c (refs/bisect/good-877d94c74e4c6665d2af55c0154363b43b947e60) Merge tag 'linux-watchdog-6.17-rc1' of git://www.linux-watchdog.org/linux-watchd
> 
> Now, I am testing:
> 
>   769ce531faa6 (HEAD) Merge branch 'pci/controller/msi-parent'
> 
> I'll be back to it later. git bisect says:
> 
>   Bisecting: 65 revisions left to test after this (roughly 6 steps)

Kenneth reports early-stage reboots caused by d7d8ab87e3e
("PCI: vmd: Switch to msi_create_parent_irq_domain()"):

https://lore.kernel.org/all/dfa40e48-8840-4e61-9fda-25cdb3ad81c1@panix.com/

Perhaps you're witnessing the same issue?

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL v2] PCI changes for v6.17
  2025-08-07  3:51     ` Lukas Wunner
@ 2025-08-07  4:44       ` Nam Cao
  2025-08-07  4:54       ` Ammar Faizi
  1 sibling, 0 replies; 20+ messages in thread
From: Nam Cao @ 2025-08-07  4:44 UTC (permalink / raw)
  To: Lukas Wunner, Ammar Faizi
  Cc: Bjorn Helgaas, Linus Torvalds, Linux PCI Mailing List,
	Linux Kernel Mailing List, Rob Herring, Lorenzo Pieralisi,
	Manivannan Sadhasivam, Krzysztof Wilczynski, Armando Budianto,
	Alviro Iskandar Setiawan, gwml

Lukas Wunner <lukas@wunner.de> writes:

> On Thu, Aug 07, 2025 at 10:34:08AM +0700, Ammar Faizi wrote:
>> On Fri, 01 Aug 2025 21:37:28 +0000, pr-tracker-bot@kernel.org, wrote:
>> >
>> > The pull request you sent on Fri, 1 Aug 2025 09:22:54 -0500:
>> >
>> > > git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git tags/pci-v6.17-changes
>> >
>> > has been merged into torvalds/linux.git:
>> > https://git.kernel.org/torvalds/c/0bd0a41a5120f78685a132834865b0a631b9026a
>> 
>> Yesterday, I synced with Linus' tree, but couldn't boot. Crashed with
>> this call trace:
>> 
>>   https://gist.githubusercontent.com/ammarfaizi2/3ba41f13517be4bae70cde869347d259/raw/0ac09b3e1d90d51c3fed14ca9f837f45d7730f0a/crash.jpg
>> 
>> This morning, I synced with Linus' tree again, still the same result.
>> 
>> I suspect it's related to pci. I'm still bisecting. I've successfully
>> narrowed it down to this pci pull.
>> 
>>   0bd0a41a5120 (refs/bisect/bad) Merge tag 'pci-v6.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
>>   db5f0c3e3e60 ring-buffer: Convert ring_buffer_write() to use guard(preempt_notrace)
>>   12d518961586 tracing: Use __free(kfree) in trace.c to remove gotos
>>   debe57fbe12c tracing: Add guard() around locks and mutexes in trace.c
>>   788fa4b47cdc tracing: Add guard(ring_buffer_nest)
>>   c89504a703fb tracing: Remove unneeded goto out logic
>>   877d94c74e4c (refs/bisect/good-877d94c74e4c6665d2af55c0154363b43b947e60) Merge tag 'linux-watchdog-6.17-rc1' of git://www.linux-watchdog.org/linux-watchd
>> 
>> Now, I am testing:
>> 
>>   769ce531faa6 (HEAD) Merge branch 'pci/controller/msi-parent'
>> 
>> I'll be back to it later. git bisect says:
>> 
>>   Bisecting: 65 revisions left to test after this (roughly 6 steps)
>
> Kenneth reports early-stage reboots caused by d7d8ab87e3e
> ("PCI: vmd: Switch to msi_create_parent_irq_domain()"):
>
> https://lore.kernel.org/all/dfa40e48-8840-4e61-9fda-25cdb3ad81c1@panix.com/
>
> Perhaps you're witnessing the same issue?

Thanks for the Cc. The backtrace does look like something that the
commit would cause.

Let me stare at it.

Nam

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL v2] PCI changes for v6.17
  2025-08-07  3:51     ` Lukas Wunner
  2025-08-07  4:44       ` Nam Cao
@ 2025-08-07  4:54       ` Ammar Faizi
  2025-08-07  5:03         ` Nam Cao
  1 sibling, 1 reply; 20+ messages in thread
From: Ammar Faizi @ 2025-08-07  4:54 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Nam Cao, Bjorn Helgaas, Linus Torvalds, Linux PCI Mailing List,
	Linux Kernel Mailing List, Rob Herring, Lorenzo Pieralisi,
	Manivannan Sadhasivam, Krzysztof Wilczynski, Armando Budianto,
	Alviro Iskandar Setiawan, gwml

On Thu, Aug 07, 2025 at 05:51:57AM +0200, Lukas Wunner wrote:
> Kenneth reports early-stage reboots caused by d7d8ab87e3e
> ("PCI: vmd: Switch to msi_create_parent_irq_domain()"):
> 
> https://lore.kernel.org/all/dfa40e48-8840-4e61-9fda-25cdb3ad81c1@panix.com/
> 
> Perhaps you're witnessing the same issue?

Confirmed, reverting that commit works on my machine. I'll try to
further diagnose it and report more details.

-- 
Ammar Faizi


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL v2] PCI changes for v6.17
  2025-08-07  4:54       ` Ammar Faizi
@ 2025-08-07  5:03         ` Nam Cao
  2025-08-07  5:13           ` Ammar Faizi
  0 siblings, 1 reply; 20+ messages in thread
From: Nam Cao @ 2025-08-07  5:03 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: Lukas Wunner, Bjorn Helgaas, Linus Torvalds,
	Linux PCI Mailing List, Linux Kernel Mailing List, Rob Herring,
	Lorenzo Pieralisi, Manivannan Sadhasivam, Krzysztof Wilczynski,
	Armando Budianto, Alviro Iskandar Setiawan, gwml

On Thu, Aug 07, 2025 at 11:54:12AM +0700, Ammar Faizi wrote:
> On Thu, Aug 07, 2025 at 05:51:57AM +0200, Lukas Wunner wrote:
> > Kenneth reports early-stage reboots caused by d7d8ab87e3e
> > ("PCI: vmd: Switch to msi_create_parent_irq_domain()"):
> > 
> > https://lore.kernel.org/all/dfa40e48-8840-4e61-9fda-25cdb3ad81c1@panix.com/
> > 
> > Perhaps you're witnessing the same issue?
> 
> Confirmed, reverting that commit works on my machine. I'll try to
> further diagnose it and report more details.

Does the diff below help?

diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index 9bbb0ff4cc15..b679c7f28f51 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -280,10 +280,12 @@ static int vmd_msi_alloc(struct irq_domain *domain, unsigned int virq,
 static void vmd_msi_free(struct irq_domain *domain, unsigned int virq,
 			 unsigned int nr_irqs)
 {
+	struct irq_data *irq_data;
 	struct vmd_irq *vmdirq;
 
 	for (int i = 0; i < nr_irqs; ++i) {
-		vmdirq = irq_get_chip_data(virq + i);
+		irq_data = irq_domain_get_irq_data(domain, virq + i);
+		vmdirq = irq_data->chip_data;
 
 		synchronize_srcu(&vmdirq->irq->srcu);
 

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [GIT PULL v2] PCI changes for v6.17
  2025-08-07  5:03         ` Nam Cao
@ 2025-08-07  5:13           ` Ammar Faizi
  2025-08-08 10:59             ` Ammar Faizi
  0 siblings, 1 reply; 20+ messages in thread
From: Ammar Faizi @ 2025-08-07  5:13 UTC (permalink / raw)
  To: Nam Cao
  Cc: Lukas Wunner, Bjorn Helgaas, Linus Torvalds,
	Linux PCI Mailing List, Linux Kernel Mailing List, Rob Herring,
	Lorenzo Pieralisi, Manivannan Sadhasivam, Krzysztof Wilczynski,
	Armando Budianto, Alviro Iskandar Setiawan, gwml

On Thu, Aug 07, 2025 at 07:03:50AM +0200, Nam Cao wrote:
> Does the diff below help?
> 
> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> index 9bbb0ff4cc15..b679c7f28f51 100644
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -280,10 +280,12 @@ static int vmd_msi_alloc(struct irq_domain *domain, unsigned int virq,
>  static void vmd_msi_free(struct irq_domain *domain, unsigned int virq,
>  			 unsigned int nr_irqs)
>  {
> +	struct irq_data *irq_data;
>  	struct vmd_irq *vmdirq;
>  
>  	for (int i = 0; i < nr_irqs; ++i) {
> -		vmdirq = irq_get_chip_data(virq + i);
> +		irq_data = irq_domain_get_irq_data(domain, virq + i);
> +		vmdirq = irq_data->chip_data;
>  
>  		synchronize_srcu(&vmdirq->irq->srcu);
>  

Yes, it works.

Tested-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>

Thank you for the fix!

-- 
Ammar Faizi


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL v2] PCI changes for v6.17
  2025-08-07  5:13           ` Ammar Faizi
@ 2025-08-08 10:59             ` Ammar Faizi
  2025-08-08 11:19               ` Ammar Faizi
  0 siblings, 1 reply; 20+ messages in thread
From: Ammar Faizi @ 2025-08-08 10:59 UTC (permalink / raw)
  To: Nam Cao
  Cc: Lukas Wunner, Bjorn Helgaas, Linus Torvalds,
	Linux PCI Mailing List, Linux Kernel Mailing List, Rob Herring,
	Lorenzo Pieralisi, Manivannan Sadhasivam, Krzysztof Wilczynski,
	Armando Budianto, Alviro Iskandar Setiawan, gwml

On Thu, Aug 07, 2025 at 12:13:37PM +0700, Ammar Faizi wrote:
> On Thu, Aug 07, 2025 at 07:03:50AM +0200, Nam Cao wrote:
> > Does the diff below help?
> 
> Yes, it works.

So today, I synced with Linus' master branch again:

  37816488247d ("Merge tag 'net-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net")

and applied your fix on top of it.

I can boot, but I get this splat. Looking at the call trace, it seems
it's still related to pci, but different issue. The call trace is also
different from the previous one.

Let me know if you have something for me to test.

  [    1.017767] pci 10000:e0:1d.0: bridge window [mem 0x82000000-0x820fffff]: assigned
  [    1.018519] pci 10000:e0:1d.0: bridge window [io  size 0x1000]: can't assign; no space
  [    1.019268] pci 10000:e0:1d.0: bridge window [io  size 0x1000]: failed to assign
  [    1.020026] pci 10000:e0:1d.0: bridge window [io  size 0x1000]: can't assign; no space
  [    1.020789] pci 10000:e0:1d.0: bridge window [io  size 0x1000]: failed to assign
  [    1.021539] pci 10000:e1:00.0: BAR 0 [mem 0x82000000-0x82003fff 64bit]: assigned
  [    1.022317] pci 10000:e0:1d.0: PCI bridge to [bus e1]
  [    1.023091] pci 10000:e0:1d.0:   bridge window [mem 0x82000000-0x820fffff]
  [    1.023885] pci 10000:e1:00.0: VMD: Default LTR value set by driver
  [    1.024654] pci 10000:e1:00.0: can't override BIOS ASPM; OS doesn't have ASPM control
  [    1.025442] pcieport 10000:e0:1d.0: can't derive routing for PCI INT A
  [    1.026245] pcieport 10000:e0:1d.0: PCI INT A: no GSI
  [    1.027058] ------------[ cut here ]------------
  [    1.027849] WARNING: CPU: 0 PID: 166 at drivers/pci/controller/vmd.c:309 vmd_init_dev_msi_info+0x36/0x40 [vmd]
  [    1.028649] Modules linked in: hid_generic i2c_hid_acpi i2c_hid drm intel_lpss_pci i2c_i801 i2c_mux intel_lpss idma64 i2c_smbus vmd(+) hid video wmi pinctrl_tigerlake
  [    1.029467] CPU: 0 UID: 0 PID: 166 Comm: systemd-udevd Not tainted 6.16.0-afh-home-2025-08-08-g6026508bdb9d #10 PREEMPT(full)  fe08b908bb15b9ded6f7769c45f204194ebf7eef
  [    1.030301] Hardware name: HP HP Laptop 14s-dq2xxx/87FD, BIOS F.21 03/21/2022
  [    1.031122] RIP: 0010:vmd_init_dev_msi_info+0x36/0x40 [vmd]
  [    1.031953] Code: 48 89 cb e8 7c 49 4f e1 84 c0 74 1a 48 8b 53 20 48 c7 42 18 70 18 40 a0 48 8b 53 20 48 c7 42 20 e0 17 40 a0 5b c3 31 c0 5b c3 <0f> 0b 31 c0 c3 0f 1f 44 00 00 0f 1f 44 00 00 41 56 41 55 41 54 49
  [    1.032798] RSP: 0018:ffff888105a47860 EFLAGS: 00010297
  [    1.033642] RAX: ffff8881014d5d98 RBX: ffff8881014d5c00 RCX: ffff8881014d5d98
  [    1.034506] RDX: ffff888120a49400 RSI: ffff888120a49400 RDI: ffff88810132b0c8
  [    1.035359] RBP: 0000000000000001 R08: ffff888100d8de3a R09: 00000000ffffffff
  [    1.036206] R10: 0000000000000004 R11: 0000000000000005 R12: ffff8881019b7e40
  [    1.036561] usb 1-2: new full-speed USB device number 3 using xhci_hcd
  [    1.037058] R13: ffffffffa020c680 R14: ffff88810132b0c8 R15: ffff888120a49400
  [    1.038724] FS:  00007f25ede3a8c0(0000) GS:ffff8890f1a2d000(0000) knlGS:0000000000000000
  [    1.039559] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [    1.040400] CR2: 00007f25ee475ea0 CR3: 000000011dfd1003 CR4: 0000000000770ef0
  [    1.041265] PKRU: 55555554
  [    1.042121] Call Trace:
  [    1.042971]  <TASK>
  [    1.043803]  msi_create_device_irq_domain+0x1eb/0x290
  [    1.044645]  __pci_enable_msi_range+0x106/0x300
  [    1.045488]  pci_alloc_irq_vectors_affinity+0xc5/0x110
  [    1.046336]  pcie_portdrv_probe+0x24e/0x610
  [    1.047177]  ? kernfs_activate+0x48/0x60
  [    1.048015]  local_pci_probe+0x3c/0x80
  [    1.048854]  pci_device_probe+0xbc/0x1b0
  [    1.049693]  really_probe+0xcd/0x380
  [    1.050529]  ? driver_probe_device+0x90/0x90
  [    1.051353]  __driver_probe_device+0x78/0x150
  [    1.052182]  driver_probe_device+0x1f/0x90
  [    1.053000]  __device_attach_driver+0x76/0xf0
  [    1.053812]  bus_for_each_drv+0x69/0xa0
  [    1.054633]  __device_attach+0xaa/0x1a0
  [    1.055448]  pci_bus_add_device+0x4c/0x70
  [    1.056260]  pci_bus_add_devices+0x2c/0x70
  [    1.057063]  vmd_probe+0x81e/0xa20 [vmd 65bddf00234a3cddd21388091a077f038c9af2be]
  [    1.057881]  local_pci_probe+0x3c/0x80
  [    1.058682]  pci_device_probe+0xbc/0x1b0
  [    1.059489]  really_probe+0xcd/0x380
  [    1.060303]  ? __device_attach_driver+0xf0/0xf0
  [    1.061112]  __driver_probe_device+0x78/0x150
  [    1.061908]  driver_probe_device+0x1f/0x90
  [    1.062711]  __driver_attach+0xbf/0x1b0
  [    1.063500]  bus_for_each_dev+0x64/0xa0
  [    1.064303]  bus_add_driver+0x10a/0x230
  [    1.065100]  driver_register+0x55/0xf0
  [    1.065892]  ? vmd_drv_exit+0x9a0/0x9a0 [vmd 65bddf00234a3cddd21388091a077f038c9af2be]
  [    1.066673]  do_one_initcall+0x31/0x1e0
  [    1.067460]  do_init_module+0x60/0x260
  [    1.068264]  init_module_from_file+0x74/0x90
  [    1.069068]  idempotent_init_module+0xed/0x2c0
  [    1.069853]  __x64_sys_finit_module+0x65/0xd0
  [    1.070630]  do_syscall_64+0x56/0x260
  [    1.071394]  entry_SYSCALL_64_after_hwframe+0x29/0x31
  [    1.072152] RIP: 0033:0x7f25ee53288d
  [    1.072918] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
  [    1.073698] RSP: 002b:00007fff09cbd5b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
  [    1.074477] RAX: ffffffffffffffda RBX: 000055d9edf36030 RCX: 00007f25ee53288d
  [    1.075254] RDX: 0000000000000000 RSI: 00007f25ee6cc441 RDI: 0000000000000005
  [    1.076042] RBP: 0000000000020000 R08: 0000000000000000 R09: 00007fff09cbd6f0
  [    1.076827] R10: 0000000000000005 R11: 0000000000000246 R12: 00007f25ee6cc441
  [    1.077609] R13: 000055d9edf323b0 R14: 000055d9edf36120 R15: 000055d9edf35850
  [    1.078389]  </TASK>
  [    1.079166] ---[ end trace 0000000000000000 ]---

-- 
Ammar Faizi


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL v2] PCI changes for v6.17
  2025-08-08 10:59             ` Ammar Faizi
@ 2025-08-08 11:19               ` Ammar Faizi
  2025-08-08 13:34                 ` Ammar Faizi
  2025-08-08 18:07                 ` Nam Cao
  0 siblings, 2 replies; 20+ messages in thread
From: Ammar Faizi @ 2025-08-08 11:19 UTC (permalink / raw)
  To: Nam Cao
  Cc: Lukas Wunner, Bjorn Helgaas, Linus Torvalds,
	Linux PCI Mailing List, Linux Kernel Mailing List, Rob Herring,
	Lorenzo Pieralisi, Manivannan Sadhasivam, Krzysztof Wilczynski,
	Armando Budianto, Alviro Iskandar Setiawan, gwml

On Fri, Aug 08, 2025 at 05:59:23PM +0700, Ammar Faizi wrote:
> On Thu, Aug 07, 2025 at 12:13:37PM +0700, Ammar Faizi wrote:
> > On Thu, Aug 07, 2025 at 07:03:50AM +0200, Nam Cao wrote:
> > > Does the diff below help?
> > 
> > Yes, it works.
> 
> So today, I synced with Linus' master branch again:
> 
>   37816488247d ("Merge tag 'net-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net")
> 
> and applied your fix on top of it.
> 
> I can boot, but I get this splat. Looking at the call trace, it seems
> it's still related to pci, but different issue. The call trace is also
> different from the previous one.

It'll be a bit tricky to bisect this one. Because if I step back to
a bad commit post:

   d7d8ab87e3e ("PCI: vmd: Switch to msi_create_parent_irq_domain()")

I won't be able to boot to reach this new splat :/

I guess I need to apply the fix dirty for each bisection step. But I'll
also need to make sure the current step has the d7d8ab87e3e commit
anchestor before applying.

-- 
Ammar Faizi


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL v2] PCI changes for v6.17
  2025-08-08 11:19               ` Ammar Faizi
@ 2025-08-08 13:34                 ` Ammar Faizi
  2025-08-08 18:07                 ` Nam Cao
  1 sibling, 0 replies; 20+ messages in thread
From: Ammar Faizi @ 2025-08-08 13:34 UTC (permalink / raw)
  To: Nam Cao
  Cc: Lukas Wunner, Bjorn Helgaas, Linus Torvalds,
	Linux PCI Mailing List, Linux Kernel Mailing List, Rob Herring,
	Lorenzo Pieralisi, Manivannan Sadhasivam, Krzysztof Wilczynski,
	Armando Budianto, Alviro Iskandar Setiawan, gwml

On Fri, Aug 08, 2025 at 06:19:18PM +0700, Ammar Faizi wrote:
> It'll be a bit tricky to bisect this one. Because if I step back to
> a bad commit post:
> 
>    d7d8ab87e3e ("PCI: vmd: Switch to msi_create_parent_irq_domain()")
> 
> I won't be able to boot to reach this new splat :/
> 
> I guess I need to apply the fix dirty for each bisection step. But I'll
> also need to make sure the current step has the d7d8ab87e3e commit
> anchestor before applying.

Unfortunately, it turns out I don't have time to bisect it at the moment.
But if you have something for me to test, I'll give it a try.

I can continue bisecting it later, maybe after rc1 is out if the fix is
still pending.

For now, I revert the PCI merge entirely in my local tree and it
resolves the issue.

  0bd0a41a5120 ("Merge tag 'pci-v6.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pc")

  git revert -m 1 -s 0bd0a41a5120;

This is an interesting case for me because it is my first time reverting
a merge commit :-)

I'll probably need to revert that revert-commit later when the PCI fix
actually arrives in mainline before syncing.

Happy weekend!

-- 
Ammar Faizi


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL v2] PCI changes for v6.17
  2025-08-08 11:19               ` Ammar Faizi
  2025-08-08 13:34                 ` Ammar Faizi
@ 2025-08-08 18:07                 ` Nam Cao
  2025-08-08 22:52                   ` Ammar Faizi
  1 sibling, 1 reply; 20+ messages in thread
From: Nam Cao @ 2025-08-08 18:07 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: Lukas Wunner, Bjorn Helgaas, Linus Torvalds,
	Linux PCI Mailing List, Linux Kernel Mailing List, Rob Herring,
	Lorenzo Pieralisi, Manivannan Sadhasivam, Krzysztof Wilczynski,
	Armando Budianto, Alviro Iskandar Setiawan, gwml

Ammar Faizi <ammarfaizi2@gnuweeb.org> writes:

> On Fri, Aug 08, 2025 at 05:59:23PM +0700, Ammar Faizi wrote:
>> On Thu, Aug 07, 2025 at 12:13:37PM +0700, Ammar Faizi wrote:
>> > On Thu, Aug 07, 2025 at 07:03:50AM +0200, Nam Cao wrote:
>> > > Does the diff below help?
>> > 
>> > Yes, it works.
>> 
>> So today, I synced with Linus' master branch again:
>> 
>>   37816488247d ("Merge tag 'net-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net")
>> 
>> and applied your fix on top of it.
>> 
>> I can boot, but I get this splat. Looking at the call trace, it seems
>> it's still related to pci, but different issue. The call trace is also
>> different from the previous one.
>
> It'll be a bit tricky to bisect this one. Because if I step back to
> a bad commit post:
>
>    d7d8ab87e3e ("PCI: vmd: Switch to msi_create_parent_irq_domain()")
>
> I won't be able to boot to reach this new splat :/
>
> I guess I need to apply the fix dirty for each bisection step. But I'll
> also need to make sure the current step has the d7d8ab87e3e commit
> anchestor before applying.

There is no point in bisecting before that commit, because the WARN_ON()
is added by that commit, so you wouldn't see anything before that.

The WARN_ONCE() tells us that some devices down the PCI tree are
allocating MSI, but VMD supports MSI-X only.

From the backtrace:
   msi_create_device_irq_domain+0x1eb/0x290
   __pci_enable_msi_range+0x106/0x300
   pci_alloc_irq_vectors_affinity+0xc5/0x110
   pcie_portdrv_probe+0x24e/0x610

It seems MSI-X are allocated first, but fail for some reason. Then
fallback to MSI, which triggers the WARN_ON().

So we need to figure out why MSI-X allocation fail.

I may need to ask you to insert a bunch of printk() to help me pinpoint
the problem. But let me stare at it first..

Nam

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL v2] PCI changes for v6.17
  2025-08-08 18:07                 ` Nam Cao
@ 2025-08-08 22:52                   ` Ammar Faizi
  2025-08-09  4:34                     ` Nam Cao
  0 siblings, 1 reply; 20+ messages in thread
From: Ammar Faizi @ 2025-08-08 22:52 UTC (permalink / raw)
  To: Nam Cao
  Cc: Thomas Gleixner, Lukas Wunner, Bjorn Helgaas, Linus Torvalds,
	Linux PCI Mailing List, Linux Kernel Mailing List, Rob Herring,
	Lorenzo Pieralisi, Manivannan Sadhasivam, Krzysztof Wilczynski,
	Armando Budianto, Alviro Iskandar Setiawan, gwml

+ Adding tglx as he's involved in reviewing that commit.

Context: https://lore.kernel.org/linux-pci/aJXYhfc%2F6DfcqfqF@linux.gnuweeb.org/

On Fri, Aug 08, 2025 at 08:07:03PM +0200, Nam Cao wrote:
> There is no point in bisecting before that commit, because the WARN_ON()
> is added by that commit, so you wouldn't see anything before that.

I didn't notice that. Good you pointed that out earlier.

> The WARN_ONCE() tells us that some devices down the PCI tree are
> allocating MSI, but VMD supports MSI-X only.
> 
> From the backtrace:
>    msi_create_device_irq_domain+0x1eb/0x290
>    __pci_enable_msi_range+0x106/0x300
>    pci_alloc_irq_vectors_affinity+0xc5/0x110
>    pcie_portdrv_probe+0x24e/0x610
> 
> It seems MSI-X are allocated first, but fail for some reason. Then
> fallback to MSI, which triggers the WARN_ON().
> 
> So we need to figure out why MSI-X allocation fail.
> 
> I may need to ask you to insert a bunch of printk() to help me pinpoint
> the problem. But let me stare at it first..

I can do that. Send me a git diff. I'll test it and back with the dmesg
output.

-- 
Ammar Faizi


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL v2] PCI changes for v6.17
  2025-08-08 22:52                   ` Ammar Faizi
@ 2025-08-09  4:34                     ` Nam Cao
  2025-08-09 13:28                       ` Ammar Faizi
  0 siblings, 1 reply; 20+ messages in thread
From: Nam Cao @ 2025-08-09  4:34 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: Thomas Gleixner, Lukas Wunner, Bjorn Helgaas, Linus Torvalds,
	Linux PCI Mailing List, Linux Kernel Mailing List, Rob Herring,
	Lorenzo Pieralisi, Manivannan Sadhasivam, Krzysztof Wilczynski,
	Armando Budianto, Alviro Iskandar Setiawan, gwml, namcaov

On Sat, Aug 09, 2025 at 07:52:30AM +0900, Ammar Faizi wrote:
> I can do that. Send me a git diff. I'll test it and back with the dmesg
> output.

That would be very helpful, thanks!

Please bear with me, this may take a few iterations.

Let's first try the below.


diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index 9bbb0ff4cc15..ce477c030990 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -304,11 +304,14 @@ static bool vmd_init_dev_msi_info(struct device *dev, struct irq_domain *domain,
 				  struct irq_domain *real_parent,
 				  struct msi_domain_info *info)
 {
+	pr_err("%s: bus_token=%d\n", __func__, (int)info->bus_token);
 	if (WARN_ON_ONCE(info->bus_token != DOMAIN_BUS_PCI_DEVICE_MSIX))
 		return false;
 
-	if (!msi_lib_init_dev_msi_info(dev, domain, real_parent, info))
+	if (!msi_lib_init_dev_msi_info(dev, domain, real_parent, info)) {
+		pr_err("%s:%d err=msi_lib_init\n", __func__, __LINE__);
 		return false;
+	}
 
 	info->chip->irq_enable		= vmd_pci_msi_enable;
 	info->chip->irq_disable		= vmd_pci_msi_disable;
diff --git a/drivers/pci/msi/api.c b/drivers/pci/msi/api.c
index 818d55fbad0d..e5f83036d76f 100644
--- a/drivers/pci/msi/api.c
+++ b/drivers/pci/msi/api.c
@@ -265,6 +265,7 @@ int pci_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs,
 	}
 
 	if (flags & PCI_IRQ_MSIX) {
+		pci_err(dev, "%s:%d msix\n", __func__, __LINE__);
 		nvecs = __pci_enable_msix_range(dev, NULL, min_vecs, max_vecs,
 						affd, flags);
 		if (nvecs > 0)
@@ -272,6 +273,7 @@ int pci_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs,
 	}
 
 	if (flags & PCI_IRQ_MSI) {
+		pci_err(dev, "%s:%d msi\n", __func__, __LINE__);
 		nvecs = __pci_enable_msi_range(dev, min_vecs, max_vecs, affd);
 		if (nvecs > 0)
 			return nvecs;
@@ -279,6 +281,7 @@ int pci_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs,
 
 	/* use INTx IRQ if allowed */
 	if (flags & PCI_IRQ_INTX) {
+		pci_err(dev, "%s:%d INTx\n", __func__, __LINE__);
 		if (min_vecs == 1 && dev->irq) {
 			/*
 			 * Invoke the affinity spreading logic to ensure that
diff --git a/drivers/pci/msi/irqdomain.c b/drivers/pci/msi/irqdomain.c
index 0938ef7ebabf..6695508e9e53 100644
--- a/drivers/pci/msi/irqdomain.c
+++ b/drivers/pci/msi/irqdomain.c
@@ -13,8 +13,10 @@ int pci_msi_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 	struct irq_domain *domain;
 
 	domain = dev_get_msi_domain(&dev->dev);
-	if (domain && irq_domain_is_hierarchy(domain))
+	if (domain && irq_domain_is_hierarchy(domain)) {
+		pci_err(dev, "%s:%d err=\n", __func__, __LINE__);
 		return msi_domain_alloc_irqs_all_locked(&dev->dev, MSI_DEFAULT_DOMAIN, nvec);
+	}
 
 	return pci_msi_legacy_setup_msi_irqs(dev, nvec, type);
 }
@@ -355,6 +357,7 @@ bool pci_msi_domain_supports(struct pci_dev *pdev, unsigned int feature_mask,
 	if (!domain || !irq_domain_is_hierarchy(domain)) {
 		if (IS_ENABLED(CONFIG_PCI_MSI_ARCH_FALLBACKS))
 			return mode == ALLOW_LEGACY;
+		pci_err(pdev, "%s:%d err\n", __func__, __LINE__);
 		return false;
 	}
 
@@ -364,6 +367,7 @@ bool pci_msi_domain_supports(struct pci_dev *pdev, unsigned int feature_mask,
 		 * msi_domain_info::flags is the authoritative source of
 		 * information.
 		 */
+		pci_err(pdev, "%s:%d not msi parent\n", __func__, __LINE__);
 		info = domain->host_data;
 		supported = info->flags;
 	} else {
@@ -374,6 +378,7 @@ bool pci_msi_domain_supports(struct pci_dev *pdev, unsigned int feature_mask,
 		 * per device domain because the parent is never
 		 * expanding the PCI/MSI functionality.
 		 */
+		pci_err(pdev, "%s:%d is msi parent\n", __func__, __LINE__);
 		supported = domain->msi_parent_ops->supported_flags;
 	}
 
diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
index 34d664139f48..028281d51fd0 100644
--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -31,19 +31,25 @@ static int pci_msi_supported(struct pci_dev *dev, int nvec)
 	struct pci_bus *bus;
 
 	/* MSI must be globally enabled and supported by the device */
-	if (!pci_msi_enable)
+	if (!pci_msi_enable) {
+		pci_err(dev, "%s:%d err=pci_msi_enable\n", __func__, __LINE__);
 		return 0;
+	}
 
-	if (!dev || dev->no_msi)
+	if (!dev || dev->no_msi) {
+		pci_err(dev, "%s:%d err=no_msi\n", __func__, __LINE__);
 		return 0;
+	}
 
 	/*
 	 * You can't ask to have 0 or less MSIs configured.
 	 *  a) it's stupid ..
 	 *  b) the list manipulation code assumes nvec >= 1.
 	 */
-	if (nvec < 1)
+	if (nvec < 1) {
+		pci_err(dev, "%s:%d err=nvec\n", __func__, __LINE__);
 		return 0;
+	}
 
 	/*
 	 * Any bridge which does NOT route MSI transactions from its
@@ -60,8 +66,10 @@ static int pci_msi_supported(struct pci_dev *dev, int nvec)
 	 * at probe time.
 	 */
 	for (bus = dev->bus; bus; bus = bus->parent)
-		if (bus->bus_flags & PCI_BUS_FLAGS_NO_MSI)
+		if (bus->bus_flags & PCI_BUS_FLAGS_NO_MSI) {
+			pci_err(dev, "%s:%d bus_flags\n", __func__, __LINE__);
 			return 0;
+		}
 
 	return 1;
 }
@@ -86,8 +94,10 @@ static int pcim_setup_msi_release(struct pci_dev *dev)
 		return 0;
 
 	ret = devm_add_action(&dev->dev, pcim_msi_release, dev);
-	if (ret)
+	if (ret) {
+		pci_err(dev, "%s:%d err=%d\n", __func__, __LINE__, ret);
 		return ret;
+	}
 
 	dev->is_msi_managed = true;
 	return 0;
@@ -671,17 +681,23 @@ static int __msix_setup_interrupts(struct pci_dev *__dev, struct msix_entry *ent
 	struct pci_dev *dev __free(free_msi_irqs) = __dev;
 
 	int ret = msix_setup_msi_descs(dev, entries, nvec, masks);
-	if (ret)
+	if (ret) {
+		pci_err(dev, "%s:%d err=%d\n", __func__, __LINE__, ret);
 		return ret;
+	}
 
 	ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX);
-	if (ret)
+	if (ret) {
+		pci_err(dev, "%s:%d err=%d\n", __func__, __LINE__, ret);
 		return ret;
+	}
 
 	/* Check if all MSI entries honor device restrictions */
 	ret = msi_verify_entries(dev);
-	if (ret)
+	if (ret) {
+		pci_err(dev, "%s:%d err=%d\n", __func__, __LINE__, ret);
 		return ret;
+	}
 
 	msix_update_entries(dev, entries);
 	retain_and_null_ptr(dev);
@@ -736,8 +752,10 @@ static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries,
 	}
 
 	ret = msix_setup_interrupts(dev, entries, nvec, affd);
-	if (ret)
+	if (ret) {
+		pci_err(dev, "%s:%d err=setup irqs\n", __func__, __LINE__);
 		goto out_disable;
+	}
 
 	/* Disable INTX */
 	pci_intx_for_msi(dev, 0);
@@ -793,30 +811,40 @@ int __pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries, int
 {
 	int hwsize, rc, nvec = maxvec;
 
-	if (maxvec < minvec)
+	if (maxvec < minvec) {
+		pci_err(dev, "%s:%d err=%d\n", __func__, __LINE__, -ERANGE);
 		return -ERANGE;
+	}
 
 	if (dev->msi_enabled) {
-		pci_info(dev, "can't enable MSI-X (MSI already enabled)\n");
+		pci_err(dev, "can't enable MSI-X (MSI already enabled)\n");
 		return -EINVAL;
 	}
 
-	if (WARN_ON_ONCE(dev->msix_enabled))
+	if (WARN_ON_ONCE(dev->msix_enabled)) {
+		pci_err(dev, "%s:%d err=%d\n", __func__, __LINE__, -EINVAL);
 		return -EINVAL;
+	}
 
 	/* Check MSI-X early on irq domain enabled architectures */
-	if (!pci_msi_domain_supports(dev, MSI_FLAG_PCI_MSIX, ALLOW_LEGACY))
+	if (!pci_msi_domain_supports(dev, MSI_FLAG_PCI_MSIX, ALLOW_LEGACY)) {
+		pci_err(dev, "%s:%d err=ENOTSUPP\n", __func__, __LINE__);
 		return -ENOTSUPP;
+	}
 
 	if (!pci_msi_supported(dev, nvec) || dev->current_state != PCI_D0)
 		return -EINVAL;
 
 	hwsize = pci_msix_vec_count(dev);
-	if (hwsize < 0)
+	if (hwsize < 0) {
+		pci_err(dev, "%s:%d err=hwsize\n", __func__, __LINE__);
 		return hwsize;
+	}
 
-	if (!pci_msix_validate_entries(dev, entries, nvec))
+	if (!pci_msix_validate_entries(dev, entries, nvec)) {
+		pci_err(dev, "%s:%d err=validate entries\n", __func__, __LINE__);
 		return -EINVAL;
+	}
 
 	if (hwsize < nvec) {
 		/* Keep the IRQ virtual hackery working */
@@ -826,21 +854,27 @@ int __pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries, int
 			nvec = hwsize;
 	}
 
-	if (nvec < minvec)
+	if (nvec < minvec) {
+		pci_err(dev, "%s:%d err=ENOSPC\n", __func__, __LINE__);
 		return -ENOSPC;
+	}
 
 	rc = pci_setup_msi_context(dev);
 	if (rc)
 		return rc;
 
-	if (!pci_setup_msix_device_domain(dev, hwsize))
+	if (!pci_setup_msix_device_domain(dev, hwsize)) {
+		pci_err(dev, "%s:%d err=%d\n", __func__, __LINE__, -ENODEV);
 		return -ENODEV;
+	}
 
 	for (;;) {
 		if (affd) {
 			nvec = irq_calc_affinity_vectors(minvec, nvec, affd);
-			if (nvec < minvec)
+			if (nvec < minvec) {
+				pci_err(dev, "%s:%d err=%d\n", __func__, __LINE__, -ENOSPC);
 				return -ENOSPC;
+			}
 		}
 
 		rc = msix_capability_init(dev, entries, nvec, affd);
diff --git a/drivers/pci/pcie/portdrv.c b/drivers/pci/pcie/portdrv.c
index d1b68c18444f..44d1c0cd79fa 100644
--- a/drivers/pci/pcie/portdrv.c
+++ b/drivers/pci/pcie/portdrv.c
@@ -190,6 +190,7 @@ static int pcie_init_service_irqs(struct pci_dev *dev, int *irqs, int mask)
 		goto intx_irq;
 
 	/* Try to use MSI-X or MSI if supported */
+	pr_err("%s:%d\n", __func__, __LINE__);
 	if (pcie_port_enable_irq_vec(dev, irqs, mask) == 0)
 		return 0;
 
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index dc473faadcc8..1771b6a24765 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -229,13 +229,17 @@ static struct irq_domain *__irq_domain_create(const struct irq_domain_info *info
 
 	if (WARN_ON((info->size && info->direct_max) ||
 		    (!IS_ENABLED(CONFIG_IRQ_DOMAIN_NOMAP) && info->direct_max) ||
-		    (info->direct_max && info->direct_max != info->hwirq_max)))
+		    (info->direct_max && info->direct_max != info->hwirq_max))) {
+		pr_err("%s:%d\n err", __func__, __LINE__);
 		return ERR_PTR(-EINVAL);
+	}
 
 	domain = kzalloc_node(struct_size(domain, revmap, info->size),
 			      GFP_KERNEL, of_node_to_nid(to_of_node(info->fwnode)));
-	if (!domain)
+	if (!domain) {
+		pr_err("%s:%d\n err", __func__, __LINE__);
 		return ERR_PTR(-ENOMEM);
+	}
 
 	err = irq_domain_set_name(domain, info);
 	if (err) {
@@ -328,14 +332,18 @@ static struct irq_domain *__irq_domain_instantiate(const struct irq_domain_info
 
 	if (info->dgc_info) {
 		err = irq_domain_alloc_generic_chips(domain, info->dgc_info);
-		if (err)
+		if (err) {
+			pr_err("%s:%d\n err=%d", __func__, __LINE__, err);
 			goto err_domain_free;
+		}
 	}
 
 	if (info->init) {
 		err = info->init(domain);
-		if (err)
+		if (err) {
+			pr_err("%s:%d\n err=%d", __func__, __LINE__, err);
 			goto err_domain_gc_remove;
+		}
 	}
 
 	__irq_domain_publish(domain);
diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
index 9b09ad3f9914..b86ed78be575 100644
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -316,11 +316,14 @@ int msi_setup_device_data(struct device *dev)
 		return 0;
 
 	md = devres_alloc(msi_device_data_release, sizeof(*md), GFP_KERNEL);
-	if (!md)
+	if (!md) {
+		dev_err(dev, "%s:%d err=ENOMEM\n", __func__, __LINE__);
 		return -ENOMEM;
+	}
 
 	ret = msi_sysfs_create_group(dev);
 	if (ret) {
+		dev_err(dev, "%s:%d err=%d\n", __func__, __LINE__, ret);
 		devres_free(md);
 		return ret;
 	}
@@ -1035,16 +1038,22 @@ bool msi_create_device_irq_domain(struct device *dev, unsigned int domid,
 	const struct msi_parent_ops *pops;
 	struct fwnode_handle *fwnode;
 
-	if (!irq_domain_is_msi_parent(parent))
+	if (!irq_domain_is_msi_parent(parent)) {
+		dev_err(dev, "%s:%d err=not parent\n", __func__, __LINE__);
 		return false;
+	}
 
-	if (domid >= MSI_MAX_DEVICE_IRQDOMAINS)
+	if (domid >= MSI_MAX_DEVICE_IRQDOMAINS) {
+		dev_err(dev, "%s:%d err=max domain\n", __func__, __LINE__);
 		return false;
+	}
 
 	struct msi_domain_template *bundle __free(kfree) =
 		kmemdup(template, sizeof(*bundle), GFP_KERNEL);
-	if (!bundle)
+	if (!bundle) {
+		dev_err(dev, "%s:%d err=mem\n", __func__, __LINE__);
 		return false;
+	}
 
 	bundle->info.hwsize = hwsize;
 	bundle->info.chip = &bundle->chip;
@@ -1077,23 +1086,32 @@ bool msi_create_device_irq_domain(struct device *dev, unsigned int domid,
 	if (!fwnode)
 		return false;
 
-	if (msi_setup_device_data(dev))
+	if (msi_setup_device_data(dev)) {
+		dev_err(dev, "%s:%d err=setup\n", __func__, __LINE__);
 		return false;
+	}
 
 	guard(msi_descs_lock)(dev);
-	if (WARN_ON_ONCE(msi_get_device_domain(dev, domid)))
+	if (WARN_ON_ONCE(msi_get_device_domain(dev, domid))) {
+		dev_err(dev, "%s:%d err=get domain\n", __func__, __LINE__);
 		return false;
+	}
 
-	if (!pops->init_dev_msi_info(dev, parent, parent, &bundle->info))
+	if (!pops->init_dev_msi_info(dev, parent, parent, &bundle->info)) {
+		dev_err(dev, "%s:%d err=init_msi_info\n", __func__, __LINE__);
 		return false;
+	}
 
 	domain = __msi_create_irq_domain(fwnode, &bundle->info, IRQ_DOMAIN_FLAG_MSI_DEVICE, parent);
-	if (!domain)
+	if (!domain) {
+		dev_err(dev, "%s:%d err=create domain\n", __func__, __LINE__);
 		return false;
+	}
 
 	dev->msi.data->__domains[domid].domain = domain;
 
 	if (msi_domain_prepare_irqs(domain, dev, hwsize, &bundle->alloc_info)) {
+		pr_err("%s:%d\n err=prepare_irqs", __func__, __LINE__);
 		dev->msi.data->__domains[domid].domain = NULL;
 		irq_domain_remove(domain);
 		return false;

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [GIT PULL v2] PCI changes for v6.17
  2025-08-09  4:34                     ` Nam Cao
@ 2025-08-09 13:28                       ` Ammar Faizi
  2025-08-09 14:49                         ` Nam Cao
  0 siblings, 1 reply; 20+ messages in thread
From: Ammar Faizi @ 2025-08-09 13:28 UTC (permalink / raw)
  To: Nam Cao
  Cc: Thomas Gleixner, Lukas Wunner, Bjorn Helgaas, Linus Torvalds,
	Linux PCI Mailing List, Linux Kernel Mailing List, Rob Herring,
	Lorenzo Pieralisi, Manivannan Sadhasivam, Krzysztof Wilczynski,
	Armando Budianto, Alviro Iskandar Setiawan, gwml, namcaov

On Sat, Aug 09, 2025 at 06:34:29AM +0200, Nam Cao wrote:
> On Sat, Aug 09, 2025 at 07:52:30AM +0900, Ammar Faizi wrote:
> > I can do that. Send me a git diff. I'll test it and back with the dmesg
> > output.
> 
> That would be very helpful, thanks!
> 
> Please bear with me, this may take a few iterations.
> 
> Let's first try the below.

I just got home from a family outing. A bit slow response.

Here's the result:

  https://gist.github.com/ammarfaizi2/ef5f98123ed3868f8d64ed41662edd63#file-dmesg_pci_debug_001-txt-L853

-- 
Ammar Faizi


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL v2] PCI changes for v6.17
  2025-08-09 13:28                       ` Ammar Faizi
@ 2025-08-09 14:49                         ` Nam Cao
  2025-08-09 15:15                           ` Ammar Faizi
  0 siblings, 1 reply; 20+ messages in thread
From: Nam Cao @ 2025-08-09 14:49 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: Thomas Gleixner, Lukas Wunner, Bjorn Helgaas, Linus Torvalds,
	Linux PCI Mailing List, Linux Kernel Mailing List, Rob Herring,
	Lorenzo Pieralisi, Manivannan Sadhasivam, Krzysztof Wilczynski,
	Armando Budianto, Alviro Iskandar Setiawan, gwml, namcaov

On Sat, Aug 09, 2025 at 08:28:39PM +0700, Ammar Faizi wrote:
> On Sat, Aug 09, 2025 at 06:34:29AM +0200, Nam Cao wrote:
> > On Sat, Aug 09, 2025 at 07:52:30AM +0900, Ammar Faizi wrote:
> > > I can do that. Send me a git diff. I'll test it and back with the dmesg
> > > output.
> > 
> > That would be very helpful, thanks!
> > 
> > Please bear with me, this may take a few iterations.
> > 
> > Let's first try the below.
> 
> I just got home from a family outing. A bit slow response.
> 
> Here's the result:
> 
>   https://gist.github.com/ammarfaizi2/ef5f98123ed3868f8d64ed41662edd63#file-dmesg_pci_debug_001-txt-L853

Thanks! Here's the problem:

    [    1.037223] pcieport 10000:e0:1d.0: __pci_enable_msix_range:840 err=hwsize

The PCIe port driver enables interrupt, trying MSI-X first. However, the
device does not support MSI-X, so it tries MSI instead, which triggers
the WARN_ON() in VMD driver.

What's strange is that, the VMD doc says:

    "Intel VMD only supports MSIx Interrupts from child devices and
    therefore the BIOS must enable PCIe Hot Plug and MSIx interrups"

Is it lying to us?

Can you	please try:

    Revert d5c647b08ee0 ("PCI: vmd: Fix wrong kfree() in vmd_msi_free()")
    Revert d7d8ab87e3e7 ("PCI: vmd: Switch to msi_create_parent_irq_domain()")

So that the driver is back to the original state before I touched it.

And apply the diff below. This will tell us if my commit breaks the driver
somehow, or VMD has been allowing MSI all this time.


diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index d9b893bf4e45..e99d8cefb78d 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -255,9 +255,15 @@ static int vmd_msi_init(struct irq_domain *domain, struct msi_domain_info *info,
 			msi_alloc_info_t *arg)
 {
 	struct msi_desc *desc = arg->desc;
+	struct pci_dev *pci_dev = msi_desc_to_pci_dev(desc);
 	struct vmd_dev *vmd = vmd_from_bus(msi_desc_to_pci_dev(desc)->bus);
 	struct vmd_irq *vmdirq = kzalloc(sizeof(*vmdirq), GFP_KERNEL);
 
+	if (!pci_msix_vec_count(pci_dev))
+		pr_err("But VMD only supports MSIx Interrupts from child devices!\n");
+	else
+		pr_err("MSI-X, looking good...\n");
+	dump_stack();
 	if (!vmdirq)
 		return -ENOMEM;
 

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [GIT PULL v2] PCI changes for v6.17
  2025-08-09 14:49                         ` Nam Cao
@ 2025-08-09 15:15                           ` Ammar Faizi
  2025-08-09 15:32                             ` Nam Cao
  0 siblings, 1 reply; 20+ messages in thread
From: Ammar Faizi @ 2025-08-09 15:15 UTC (permalink / raw)
  To: Nam Cao
  Cc: Thomas Gleixner, Lukas Wunner, Bjorn Helgaas, Linus Torvalds,
	Linux PCI Mailing List, Linux Kernel Mailing List, Rob Herring,
	Lorenzo Pieralisi, Manivannan Sadhasivam, Krzysztof Wilczynski,
	Armando Budianto, Alviro Iskandar Setiawan, gwml, namcaov

On Sat, Aug 09, 2025 at 04:49:27PM +0200, Nam Cao wrote:
> Thanks! Here's the problem:
> 
>     [    1.037223] pcieport 10000:e0:1d.0: __pci_enable_msix_range:840 err=hwsize
> 
> The PCIe port driver enables interrupt, trying MSI-X first. However, the
> device does not support MSI-X, so it tries MSI instead, which triggers
> the WARN_ON() in VMD driver.
> 
> What's strange is that, the VMD doc says:
> 
>     "Intel VMD only supports MSIx Interrupts from child devices and
>     therefore the BIOS must enable PCIe Hot Plug and MSIx interrups"
> 
> Is it lying to us?
> 
> Can you	please try:
> 
>     Revert d5c647b08ee0 ("PCI: vmd: Fix wrong kfree() in vmd_msi_free()")
>     Revert d7d8ab87e3e7 ("PCI: vmd: Switch to msi_create_parent_irq_domain()")
> 
> So that the driver is back to the original state before I touched it.
> 
> And apply the diff below. This will tell us if my commit breaks the driver
> somehow, or VMD has been allowing MSI all this time.

Here's the result after reverting those two commits and applied the diff.

  https://gist.github.com/ammarfaizi2/03c7a9c0fec2a11f206931f1b7790709#file-dmesg_pci_debug_002-txt

Let's see if this one is enough for you to diagnose the problem.

Note that the previous printk() diff has to be discarded to avoid
conflict in the reverts. If that's still needed, send me a fixed
diff after clean reverts.

-- 
Ammar Faizi


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL v2] PCI changes for v6.17
  2025-08-09 15:15                           ` Ammar Faizi
@ 2025-08-09 15:32                             ` Nam Cao
  2025-08-09 16:03                               ` Ammar Faizi
  0 siblings, 1 reply; 20+ messages in thread
From: Nam Cao @ 2025-08-09 15:32 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: Thomas Gleixner, Lukas Wunner, Bjorn Helgaas, Linus Torvalds,
	Linux PCI Mailing List, Linux Kernel Mailing List, Rob Herring,
	Lorenzo Pieralisi, Manivannan Sadhasivam, Krzysztof Wilczynski,
	Armando Budianto, Alviro Iskandar Setiawan, gwml, namcaov

Ammar Faizi <ammarfaizi2@gnuweeb.org> writes:
> Here's the result after reverting those two commits and applied the diff.
>
>   https://gist.github.com/ammarfaizi2/03c7a9c0fec2a11f206931f1b7790709#file-dmesg_pci_debug_002-txt
>
> Let's see if this one is enough for you to diagnose the problem.

Thanks, I think the problem is clear now. 

The diff I sent you has a mistake, it should be
    if (pci_msix_vec_count(pci_dev) < 0)
not
    if (!pci_msix_vec_count(pci_dev))

So the log is wrong, it printed "MSI-X, looking good...". It should have
printed the other one.

But no need to re-run it, the backtrace is enough.

    MSI-X, looking good... <-------- wrong log
    CPU: 3 UID: 0 PID: 183 Comm: systemd-udevd Not tainted 6.16.0-afh2-dbg-2025-08-09-gb622ab28bcac #13 PREEMPT(full)  28137b57996795286f6544f071ec852674a057d4
    Hardware name: HP HP Laptop 14s-dq2xxx/87FD, BIOS F.21 03/21/2022
    Call Trace:
     <TASK>
    dump_stack_lvl
    vmd_msi_init
    msi_domain_alloc
    irq_domain_alloc_irqs_locked
    __irq_domain_alloc_irqs
    __msi_domain_alloc_irqs
    msi_domain_alloc_irqs_all_locked
    __msi_capability_init
    __pci_enable_msi_range
    pci_alloc_irq_vectors_affinity
    pcie_portdrv_probe

So unlike what VMD doc says, it actually can have non-MSI-X children devices!

Please discard the reverts and the diff I sent you, and try the diff
below. I believe your machine will work now.

diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index b679c7f28f51..1bd5bf4a6097 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -306,9 +306,6 @@ static bool vmd_init_dev_msi_info(struct device *dev, struct irq_domain *domain,
 				  struct irq_domain *real_parent,
 				  struct msi_domain_info *info)
 {
-	if (WARN_ON_ONCE(info->bus_token != DOMAIN_BUS_PCI_DEVICE_MSIX))
-		return false;
-
 	if (!msi_lib_init_dev_msi_info(dev, domain, real_parent, info))
 		return false;
 

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [GIT PULL v2] PCI changes for v6.17
  2025-08-09 15:32                             ` Nam Cao
@ 2025-08-09 16:03                               ` Ammar Faizi
  2025-08-09 16:28                                 ` Nam Cao
  0 siblings, 1 reply; 20+ messages in thread
From: Ammar Faizi @ 2025-08-09 16:03 UTC (permalink / raw)
  To: Nam Cao
  Cc: Thomas Gleixner, Lukas Wunner, Bjorn Helgaas, Linus Torvalds,
	Linux PCI Mailing List, Linux Kernel Mailing List, Rob Herring,
	Lorenzo Pieralisi, Manivannan Sadhasivam, Krzysztof Wilczynski,
	Armando Budianto, Alviro Iskandar Setiawan, gwml, namcaov

On Sat, Aug 09, 2025 at 05:32:16PM +0200, Nam Cao wrote:
> So unlike what VMD doc says, it actually can have non-MSI-X children devices!

If that's the conclusion, then Intel VMD doc also needs fixing :/

> Please discard the reverts and the diff I sent you, and try the diff
> below. I believe your machine will work now.

Yes, I can confirm it's now clean. Just to verify both sides, here is
the last result:

  https://gist.github.com/ammarfaizi2/72578d2b4cc385fbdb5faee69013d530

If that one fix is final, then:

Tested-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>

Thanks for the debugging work.

It's probably too late to get the fix in mainline before rc1. But if it
can go upstream sooner, that would be great.

-- 
Ammar Faizi


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL v2] PCI changes for v6.17
  2025-08-09 16:03                               ` Ammar Faizi
@ 2025-08-09 16:28                                 ` Nam Cao
  0 siblings, 0 replies; 20+ messages in thread
From: Nam Cao @ 2025-08-09 16:28 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: Thomas Gleixner, Lukas Wunner, Bjorn Helgaas, Linus Torvalds,
	Linux PCI Mailing List, Linux Kernel Mailing List, Rob Herring,
	Lorenzo Pieralisi, Manivannan Sadhasivam, Krzysztof Wilczynski,
	Armando Budianto, Alviro Iskandar Setiawan, gwml, namcaov

Ammar Faizi <ammarfaizi2@gnuweeb.org> writes:

> On Sat, Aug 09, 2025 at 05:32:16PM +0200, Nam Cao wrote:
>> So unlike what VMD doc says, it actually can have non-MSI-X children devices!
>
> If that's the conclusion, then Intel VMD doc also needs fixing :/

Other possibilities are problem with your BIOS (likely), or problem with
Linux's PCI enumeration (unlikely).

But without hardware, I cannot investigate this further. Now that we
know my commit didn't make the driver any worse, I am done here.

>> Please discard the reverts and the diff I sent you, and try the diff
>> below. I believe your machine will work now.
>
> Yes, I can confirm it's now clean. Just to verify both sides, here is
> the last result:
>
>   https://gist.github.com/ammarfaizi2/72578d2b4cc385fbdb5faee69013d530
>
> If that one fix is final, then:
>
> Tested-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
>
> Thanks for the debugging work.

Thanks for running tests, it would be impossible to figure out otherwise.

> It's probably too late to get the fix in mainline before rc1. But if it
> can go upstream sooner, that would be great.

I don't think PCI maintainers are available at the moment, so I will
send the patch next Monday. Time to enjoy my weekends..

Nam

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2025-08-09 16:28 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-01 14:22 [GIT PULL v2] PCI changes for v6.17 Bjorn Helgaas
2025-08-01 21:37 ` pr-tracker-bot
2025-08-07  3:34   ` Ammar Faizi
2025-08-07  3:51     ` Lukas Wunner
2025-08-07  4:44       ` Nam Cao
2025-08-07  4:54       ` Ammar Faizi
2025-08-07  5:03         ` Nam Cao
2025-08-07  5:13           ` Ammar Faizi
2025-08-08 10:59             ` Ammar Faizi
2025-08-08 11:19               ` Ammar Faizi
2025-08-08 13:34                 ` Ammar Faizi
2025-08-08 18:07                 ` Nam Cao
2025-08-08 22:52                   ` Ammar Faizi
2025-08-09  4:34                     ` Nam Cao
2025-08-09 13:28                       ` Ammar Faizi
2025-08-09 14:49                         ` Nam Cao
2025-08-09 15:15                           ` Ammar Faizi
2025-08-09 15:32                             ` Nam Cao
2025-08-09 16:03                               ` Ammar Faizi
2025-08-09 16:28                                 ` Nam Cao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).