Linux PCI subsystem development
 help / color / mirror / Atom feed
From: Nirmal Patel <nirmal.patel@linux.intel.com>
To: sashiko-bot@kernel.org
Cc: sashiko-reviews@lists.linux.dev, linux-pci@vger.kernel.org
Subject: Re: [PATCH v4] PCI: vmd: Add feature to scan BIOS enumerated devices.
Date: Tue, 26 May 2026 12:43:07 -0700	[thread overview]
Message-ID: <20260526124307.00005a70@linux.intel.com> (raw)
In-Reply-To: <20260522154216.AA0D61F000E9@smtp.kernel.org>

On Fri, 22 May 2026 15:42:16 +0000
sashiko-bot@kernel.org wrote:

> Thank you for your contribution! Sashiko AI review found 6 potential
> issue(s) to consider:
> 
> New issues:
> - [High] The driver unconditionally reads from fixed offsets in BAR4
> without verifying that the BAR size is sufficiently large.
> - [High] The PCI bus resource `.end` is unconditionally hardcoded to
> `0xff`, breaking legacy devices with smaller configuration BARs.
> 
> Pre-existing issues:
> - [High] Allocated IRQ vectors are leaked in multiple error paths and
> during device teardown.
> - [High] Per-CPU memory allocated for SRCU structures is leaked on
> error paths.
> - [High] SRCU lock structures are destroyed before the IRQ handlers
> are unregistered, allowing a use-after-free.
> - [High] Subtracting a fixed offset from the resource bounds of
> MEMBAR2 without verifying the actual BAR length causes an integer
> underflow. --
> 
Pre-existing issues are not in scope of the feature in this patch
series.

> commit 0653c66e9c93e9c62213a329c19bf93e4499969f
> Author: Nirmal Patel <nirmal.patel@linux.intel.com>
> Subject: PCI: vmd: Add feature to scan BIOS enumerated devices.
> 
> This commit introduces the VMD_FEAT_USE_BIOS_INFO feature flag for
> device 0x28c1. It parses configuration parameters, bus start ranges,
> and shadow registers directly from the BIOS-populated BAR4/BAR5 MMIO
> space rather than enumerating the tree dynamically.
> 
> > diff --git a/drivers/pci/controller/vmd.c
> > b/drivers/pci/controller/vmd.c index d4ae250d4bc6d..7c90243b5ff0c
> > 100644 --- a/drivers/pci/controller/vmd.c
> > +++ b/drivers/pci/controller/vmd.c  
> [ ... ]
> > @@ -661,6 +686,46 @@ static int vmd_get_bus_number_start(struct
> > vmd_dev *vmd) return 0;
> >  }
> >  
> > +static int vmd_get_bus_info_from_bar4(struct vmd_dev *vmd,
> > +				       resource_size_t *offset1,
> > +				       resource_size_t *offset2)
> > +{
> > +	u64 phys1, phys2, bar4_2840;
> > +	void __iomem *bar4;
> > +	u32 base_id;
> > +	u8 base_bus;
> > +
> > +
> > +	bar4 = pci_ioremap_bar(vmd->dev, 4);
> > +	if (!bar4)
> > +		return -ENOMEM;
> > +
> > +	/* Read shadow registers for MEMBAR1 and MEMBAR2 physical
> > addresses. */
> > +	phys1 = readq(bar4 + SHADOW_MEMBAR1_28C1);  
> 
> [Severity: High]
> Does this code read out-of-bounds if the hardware presents a smaller
> BAR4?
> 
> pci_ioremap_bar() maps the available resource, but there doesn't
> appear to be a check verifying that the BAR length is at least large
> enough to cover the fixed offsets being read, such as 0x2840
> (BASE_ID_REG_28C1).
> 
> Could a misconfigured or malicious device trigger an out-of-bounds
> MMIO read here?
I dont think so. vmd_get_bus_info_from_bar4 gets called for only device
ID specific flag  VMD_FEAT_USE_BIOS_INFO. BASE_ID_REG_28C is set to
0x2840 which is just 10KB of space and is much lesser than standard
membar2 resource size 2MB. Besides we are doing the same thing in
vmd_get_phys_offsets.
> 
> > +	phys2 = readq(bar4 + SHADOW_MEMBAR2_28C1);  
> [ ... ]
> > @@ -711,6 +776,53 @@ static int vmd_alloc_irqs(struct vmd_dev *vmd)
> >  	return 0;
> >  }
> >  
> > +static int vmd_prepare_offsets_and_bus(struct vmd_dev *vmd,
> > +					unsigned long features,
> > +					resource_size_t
> > *membar2_offset,
> > +					resource_size_t *offset1,
> > +					resource_size_t *offset2)
> > +{
> > +	int ret;
> > +
> > +	/*
> > +	 * Shadow registers may exist in certain VMD device ids
> > which allow
> > +	 * guests to correctly assign host physical addresses to
> > the root ports
> > +	 * and child devices. These registers will either return
> > the host value
> > +	 * or 0, depending on an enable bit in the VMD device.
> > +	 */
> > +	/*
> > +	 * For certain VMD devices (i.e. 0x28C1), BIOS places
> > device info
> > +	 * in BAR4 shadow registers to determine the base bus
> > number and memory
> > +	 * offsets.
> > +	 */
> > +	if (features & VMD_FEAT_USE_BIOS_INFO) {
> > +		*membar2_offset = MEMBAR2_OFFSET_28C1;  
> 
> [Severity: High]
> This is a pre-existing issue, but can setting this offset cause an
> integer underflow later?
> 
> In vmd_enable_domain(), the resource start is calculated as:
> 
> 	vmd->resources[2] = (struct resource) {
> 		.name  = "VMD MEMBAR2",
> 		.start = res->start + membar2_offset,
> 		.end   = res->end,
> 		...
> 	};
> 
> If the physical MEMBAR2 length provided by the hardware is smaller
> than membar2_offset, start becomes greater than end, which could cause
> resource_size() to underflow and yield a massively inflated memory
> window.
membar2_offset is 0x2000 and it is required to leave such space to
successfully map other VMD hardware registers including MSI-X. So
resource_size() will never be less than 0x2000 for any vmd device since
the minimum possible membar2 size is 2MB set in BIOS.
> 
> > +		ret = vmd_get_bus_info_from_bar4(vmd, offset1,
> > offset2);
> > +		if (ret)
> > +			return ret;
> > +	} else if (features & VMD_FEAT_HAS_MEMBAR_SHADOW) {  
> [ ... ]
> > @@ -784,38 +896,16 @@ static int vmd_enable_domain(struct vmd_dev
> > *vmd, unsigned long features) struct pci_dev *dev;
> >  	int ret;
> >  
> > -	/*
> > -	 * Shadow registers may exist in certain VMD device ids
> > which allow  
> [ ... ]
> > -	if (features & VMD_FEAT_HAS_BUS_RESTRICTIONS) {
> > -		ret = vmd_get_bus_number_start(vmd);
> > -		if (ret)
> > -			return ret;
> > -	}
> > +	ret = vmd_prepare_offsets_and_bus(vmd, features,
> > &membar2_offset,
> > +					  &offset[0], &offset[1]);
> > +	if (ret)
> > +		return ret;
> >  
> >  	res = &vmd->dev->resource[VMD_CFGBAR];
> >  	vmd->resources[0] = (struct resource) {
> >  		.name  = "VMD CFGBAR",
> >  		.start = vmd->busn_start,
> > -		.end   = vmd->busn_start + (resource_size(res) >>
> > 20) - 1,
> > +		.end   = 0xff,  
> 
> [Severity: High]
> Does unconditionally setting the end to 0xff cause a regression for
> legacy VMD devices with smaller configuration BARs?
> 
> Older devices on client platforms might have smaller CFGBARs.
> Claiming buses up to 255 might cause the PCI subsystem to erroneously
> assign bus numbers that fall outside the hardware's actual ECAM
> capabilities.
If i keep existing implementation (like v1), the .end value can go
out-of-bound (i.e. 478).
Keeping just .end = (resource_size(res) >> 20) - 1 will result in a
crash when busn_start is a high value like 128 and .end is set to 32.
So like other places in pci code-base (i.e. pci/probe.c), it is better
to hard code it to 0xff.
> 
> >  		.flags = IORESOURCE_BUS | IORESOURCE_PCI_FIXED,
> >  	};
> >  
> > @@ -868,19 +958,21 @@ static int vmd_enable_domain(struct vmd_dev
> > *vmd, unsigned long features)
> >  	 * acceptable because the guest is usually CPU-limited and
> > MSI
> >  	 * remapping doesn't become a performance bottleneck.
> >  	 */
> > -	if (!(features & VMD_FEAT_CAN_BYPASS_MSI_REMAP) ||
> > -	    offset[0] || offset[1]) {
> > -		ret = vmd_alloc_irqs(vmd);
> > -		if (ret)
> > -			return ret;
> > +	if (!(features & VMD_FEAT_USE_BIOS_INFO)) {
> > +		if (!(features & VMD_FEAT_CAN_BYPASS_MSI_REMAP) ||
> > +		    offset[0] || offset[1]) {
> > +			ret = vmd_alloc_irqs(vmd);
> > +			if (ret)
> > +				return ret;
> >  
> > -		vmd_set_msi_remapping(vmd, true);
> > +			vmd_set_msi_remapping(vmd, true);
> >  
> > -		ret = vmd_create_irq_domain(vmd);
> > -		if (ret)
> > -			return ret;
> > +			ret = vmd_create_irq_domain(vmd);
> > +			if (ret)
> > +				return ret;  
> 
> [Severity: High]
> This is a pre-existing issue, but are IRQ vectors leaked when errors
> occur here?
No changes are being done here as part of this feature patch series.
The existing issue needs to be reproduced and if needed, then to be
fixed in separate patches.
> 
> The PCI subsystem guidelines mandate calling pci_free_irq_vectors()
> in error paths after a successful pci_alloc_irq_vectors(). If
> vmd_create_irq_domain() fails, we return directly without freeing the
> vectors allocated in vmd_alloc_irqs(). This same cleanup omission
> appears to exist in vmd_remove() as well, potentially leading to
> system IRQ capacity exhaustion.
> 
> Additionally, if vmd_alloc_irqs() fails midway through
> devm_request_irq(), does it return without cleaning up the
> successfully initialized SRCU structures from previous loop
> iterations? The per-CPU memory allocated by init_srcu_struct() might
> be leaked.
> 
> [Severity: High]
> Also a pre-existing issue, but does using devm_request_irq() for the
> handlers cause a use-after-free during device teardown?
The issue has not be reported or seen in any vmd generations.
No changes are being done here as part of this feature patch series.
The existing issue needs to be reproduced and if needed, then to be
fixed in separate patches.
> 
> In vmd_remove(), the driver calls vmd_cleanup_srcu() which destroys
> the SRCU structures. However, since the IRQs are managed by devres,
> they aren't deregistered until after vmd_remove() returns.
> 
> If an interrupt fires in this window, vmd_irq() would call
> srcu_read_lock() on an already-destroyed SRCU structure.
> 
> > +		} else {
> > +			vmd_set_msi_remapping(vmd, false);
> > +		}
> >  	}  
> 


  reply	other threads:[~2026-05-26 19:43 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-22 15:08 [PATCH v4] PCI: vmd: Add feature to scan BIOS enumerated devices Nirmal Patel
2026-05-22 15:42 ` sashiko-bot
2026-05-26 19:43   ` Nirmal Patel [this message]
2026-05-27 19:01 ` Nirmal Patel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260526124307.00005a70@linux.intel.com \
    --to=nirmal.patel@linux.intel.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=sashiko-bot@kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox