public inbox for linux-pci@vger.kernel.org
* [BUG] PCIe bridge resource allocation creates invalid limit addresses after Secondary Bus Reset recovery
From: Shawn Jin @ 2026-03-11 22:00 UTC
  To: Bjorn Helgaas; +Cc: linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 3938 bytes --]

Hello,

I'm reporting a potentially critical bug in the Linux kernel's PCIe resource allocation code: it creates invalid bridge window limit addresses during hotplug re-enumeration after Secondary Bus Reset (SBR) recovery.

## AFFECTED KERNEL VERSIONS
- Confirmed: 5.15.0, 6.8.0 (Ubuntu 6.8.0-88-generic, 6.8.0-90-generic)
- Likely affected: All recent kernels including 6.19

## HARDWARE CONFIGURATION
Intel Ice Lake server with PCIe Gen5 switches and endpoints:

Topology 1:
  Root Port 96:01.0 → 98:00.0 → 99:01.0 → 9b:00.0 (NVIDIA L20 GPU)

Kernel parameter: pci=realloc=on

## PROBLEM DESCRIPTION

After performing a Secondary Bus Reset on a PCIe switch port and clearing the reset bit, the kernel re-enumerates devices and assigns bridge window resources. However, the assigned memory window limit addresses are INVALID according to the PCIe specification: a bridge memory window has 1 MB granularity, so its limit address must end in 0xFFFFF.
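
For context on why such limits cannot be valid: a bridge stores only address bits 31:20 of its non-prefetchable memory window in the 16-bit Memory Base and Memory Limit registers (Type 1 header offsets 0x20 and 0x22), and the low 20 bits of the limit are implied to be all ones. The sketch below decodes register values into addresses. The function names are hypothetical helpers written for this report, not kernel or pciutils interfaces, and the 64-bit prefetchable window is not covered.

```shell
# Decode a 16-bit Memory Base register value (hex) into the lowest
# address the bridge forwards. Register bits 15:4 hold address bits
# 31:20; the low 20 address bits are zero.
mem_base_to_addr() {
    reg=$((0x${1#0x}))
    printf '0x%08x\n' $(( (reg & 0xfff0) << 16 ))
}

# Decode a 16-bit Memory Limit register value (hex) into the highest
# address the bridge forwards. The low 20 address bits are not stored
# and are treated as all ones.
mem_limit_to_addr() {
    reg=$((0x${1#0x}))
    printf '0x%08x\n' $(( ((reg & 0xfff0) << 16) | 0xfffff ))
}

mem_base_to_addr e960    # prints 0xe9600000
mem_limit_to_addr e960   # prints 0xe96fffff
```

On a live system the two register values could be read with `sudo setpci -s 99:01.0 0x20.w` and `sudo setpci -s 99:01.0 0x22.w`, in the same style as the attached script. No register value decodes to a limit ending in 0xFFFFE.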

### Evidence from dmesg (Topology 1):

**Before SBR (correct allocation):**
```
[    6.636493] pci 0000:98:00.0: PCI bridge to [bus 99-9c]
[    6.636539] pci 0000:98:00.0:   bridge window [mem 0xe9600000-0xe96fffff]
[    6.636645] pci 0000:98:00.0:   bridge window [mem 0x13b000000000-0x13b7ffffffff 64bit pref]

[    6.644429] pci 0000:99:01.0: PCI bridge to [bus 9b]
[    6.644476] pci 0000:99:01.0:   bridge window [mem 0xe9600000-0xe96fffff]
[    6.644656] pci 0000:99:01.0:   bridge window [mem 0x13b000000000-0x13b7ffffffff 64bit pref]

[    6.654203] pci 0000:9b:00.0: [1e3e:0002] type 00 class 0x120000 PCIe Endpoint
[    6.654652] pci 0000:9b:00.0: BAR 0 [mem 0x13b000000000-0x13b7ffffffff 64bit pref]
[    6.654666] pci 0000:9b:00.0: BAR 2 [mem 0xe9600000-0xe963ffff]
```

**After SBR clear (INVALID allocation):**
```
[  656.644184] pci 0000:98:00.0: bridge window [mem 0x13b000000000-0x13b7ffffffff 64bit pref]: assigned
[  656.644186] pci 0000:98:00.0: bridge window [mem 0xe9600000-0xe96fffff]: assigned
[  656.644188] pci 0000:99:01.0: bridge window [mem 0x13b000000000-0x13b7fffffffe 64bit pref]: assigned
[  656.644189] pci 0000:99:01.0: bridge window [mem 0xe9600000-0xe96ffffe]: assigned

[  656.644830] pci 0000:9b:00.0: BAR 0 [mem size 0x800000000 64bit pref]: can't assign; no space
[  656.644831] pci 0000:9b:00.0: BAR 0 [mem size 0x800000000 64bit pref]: failed to assign
// BAR2 can still be assigned because the size is only 256KB, while the min window in the bridge is 1MB
[  656.644832] pci 0000:9b:00.0: BAR 2 [mem 0xe9600000-0xe963ffff]: assigned

```
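
The arithmetic behind the two outcomes above can be checked with a small sketch. It assumes only that a BAR must be naturally aligned to its size and must lie entirely inside the inclusive window; the kernel's allocator is more involved than this, and `window_size` and `bar_fits` are hypothetical helper names.

```shell
# window_size START END: print the size in bytes of an inclusive range.
window_size() {
    printf '0x%x\n' $(( $2 - $1 + 1 ))
}

# bar_fits START END SIZE: succeed if a BAR of SIZE bytes (a power of
# two), aligned to SIZE, fits somewhere in the inclusive window.
bar_fits() {
    # First SIZE-aligned address at or above START.
    first=$(( ($1 + $3 - 1) & ~($3 - 1) ))
    [ $(( first + $3 - 1 )) -le $(( $2 )) ]
}

# Prefetchable window on 99:01.0 after SBR: one byte smaller than BAR 0.
window_size 0x13b000000000 0x13b7fffffffe    # prints 0x7ffffffff
bar_fits 0x13b000000000 0x13b7fffffffe 0x800000000 || echo "BAR 0: no space"

# Non-prefetchable window: BAR 2 (256KB) still fits.
bar_fits 0xe9600000 0xe96ffffe 0x40000 && echo "BAR 2: fits"
```

With the window end at 0x13b7ffffffff, as before the SBR, the same check for BAR 0 succeeds.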

### Invalid Addresses Created by Kernel:
- `0x13b7fffffffe` (ends in 0xFFFE instead of 0xFFFF - **1 byte short**)
- `0xe96ffffe`  (ends in 0xFFFE instead of 0xFFFF - **1 byte short**)
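
To spot such windows in a longer log, a filter along these lines may help. It is a rough sketch that assumes the dmesg line format shown above (`bridge window [mem 0xSTART-0xEND ...]`); `find_bad_windows` is a hypothetical name, and the check is simply that the low 20 bits of the end address are all ones.

```shell
# Read kernel log lines on stdin and print every memory bridge window
# line whose end address does not end in 0xfffff.
find_bad_windows() {
    while IFS= read -r line; do
        # Extract the end address of a "bridge window [mem A-B" range.
        end=$(printf '%s\n' "$line" |
              sed -n 's/.*bridge window \[mem 0x[0-9a-f]*-\(0x[0-9a-f]*\).*/\1/p')
        [ -n "$end" ] || continue
        if [ $(( $end & 0xfffff )) -ne $(( 0xfffff )) ]; then
            printf '%s\n' "$line"
        fi
    done
}
```

Typical use would be `sudo dmesg | find_bad_windows`. Lines for I/O windows and for unassigned resources (`[mem size ...]`) do not match the pattern and are skipped.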

## IMPACT

1. **Device initialization failure**: Endpoints cannot allocate required BARs
   ```
   [  656.644830] pci 0000:9b:00.0: BAR 0 [mem size 0x800000000 64bit pref]: can't assign; no space
   [  656.644831] pci 0000:9b:00.0: BAR 0 [mem size 0x800000000 64bit pref]: failed to assign
   ```

2. **Consistent across multiple hierarchies**: Affects different PCIe topologies independently

## REPRODUCTION

The attached script test_rc_sbr.sh.txt removes the switch upstream port (USP), issues an SBR at the root port, and rescans the PCI bus.

## SUSPECTED ROOT CAUSE

The bug appears to be in `drivers/pci/setup-bus.c`, likely in:
- `pci_bus_distribute_available_resources()`
- `adjust_bridge_window()`
- `pci_assign_unassigned_bridge_resources()`

The resource end address calculation appears to perform multiple subtractions:
1. Initial calculation: `res->end = res->start + size - 1` (correct)
2. During redistribution: Another subtraction occurs, creating `res->end = ... - 2`
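
The suspected off-by-one can be expressed as plain arithmetic on the values from dmesg. This is only a sketch of the inclusive-range calculation with hypothetical helper names; it does not reproduce the logic in `setup-bus.c`.

```shell
# expected_end START SIZE: inclusive end of a SIZE-byte range at START.
expected_end() {
    printf '0x%x\n' $(( $1 + $2 - 1 ))
}

# end_shortfall START SIZE END: bytes by which END falls below the
# expected inclusive end (0 means the range is exactly SIZE bytes).
end_shortfall() {
    printf '%d\n' $(( ($1 + $2 - 1) - $3 ))
}

expected_end  0x13b000000000 0x800000000                   # prints 0x13b7ffffffff
end_shortfall 0x13b000000000 0x800000000 0x13b7fffffffe    # prints 1
end_shortfall 0xe9600000 0x100000 0xe96ffffe               # prints 1
```

Both windows on 99:01.0 come out exactly one byte below the expected end, which is what an end computed as `start + size - 2` would produce.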

## WORKAROUND ATTEMPTS

- `pci=realloc=on`: Does NOT fix the issue
- Manual remove/rescan from root: Does NOT fix the issue
- Initial boot allocation: Works correctly (bug only occurs during hotplug re-enumeration)

## REQUEST

I want to track how the bridge windows are allocated. Is there a way to enable additional kernel messages that show the allocation path? Please investigate whether this is a real kernel bug.

Thank you,
Shawn

[-- Attachment #2: test_rc_sbr.sh.txt --]
[-- Type: text/plain, Size: 1347 bytes --]

#!/bin/bash

# Function to display usage
usage() {
    echo "Usage: $0 -rp <ROOT_PORT_BDF> -usp <USP_BDF>"
    echo "Example: $0 -rp c6:01.0 -usp c7:00.0"
    exit 1
}

# Initialize variables
ROOT_PORT_BDF=""
USP_BDF=""

# Parse command-line arguments
while [[ $# -gt 0 ]]; do
    case $1 in
        -rp)
            ROOT_PORT_BDF="$2"
            shift 2
            ;;
        -usp)
            USP_BDF="$2"
            shift 2
            ;;
        -h|--help)
            usage
            ;;
        *)
            echo "Unknown option: $1"
            usage
            ;;
    esac
done

# Validate that both arguments are provided
if [ -z "$ROOT_PORT_BDF" ] || [ -z "$USP_BDF" ]; then
    echo "Error: Both -rp and -usp arguments are required"
    usage
fi

echo "Root Port BDF: $ROOT_PORT_BDF"
echo "USP BDF: $USP_BDF"
echo ""

# Remove the USP device
echo 1 | sudo tee /sys/bus/pci/devices/0000:${USP_BDF}/remove

# Trigger SBR via the Bridge Control register (offset 0x3E); bit 6 (0x0040) is Secondary Bus Reset
BRIDGE_CTL=$(sudo setpci -s ${ROOT_PORT_BDF} 0x3E.w)
BRIDGE_CTL_RESET=$(printf "0x%04x" $((0x$BRIDGE_CTL | 0x0040)))

echo "Asserting Secondary Bus Reset..."
sudo setpci -s ${ROOT_PORT_BDF} 0x3E.w=$BRIDGE_CTL_RESET
sleep 1

echo "De-asserting Secondary Bus Reset..."
sudo setpci -s ${ROOT_PORT_BDF} 0x3E.w=$BRIDGE_CTL
sleep 2

# Rescan PCI bus
echo 1 | sudo tee /sys/bus/pci/rescan

Thread overview: 12+ messages
2026-03-11 22:00 [BUG] PCIe bridge resource allocation creates invalid limit addresses after Secondary Bus Reset recovery Shawn Jin
2026-03-11 23:19 ` Bjorn Helgaas
2026-03-12  0:02   ` Shawn Jin
2026-03-12  1:03     ` Shawn Jin
2026-03-12 13:24       ` Ilpo Järvinen
2026-03-12 17:14         ` Shawn Jin
2026-03-12 17:48           ` Ilpo Järvinen
2026-03-13 16:48             ` Shawn Jin
2026-03-16 10:28               ` Ilpo Järvinen
2026-03-16 17:26                 ` Shawn Jin
2026-03-12 17:34     ` Bjorn Helgaas
2026-03-12 17:40       ` Shawn Jin
