Linux PCI subsystem development
From: <dan.j.williams@intel.com>
To: Michael Kelley <mhklinux@outlook.com>,
	Dan Williams <dan.j.williams@intel.com>,
	"bhelgaas@google.com" <bhelgaas@google.com>
Cc: "linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"lukas@wunner.de" <lukas@wunner.de>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Jonathan.Cameron@huawei.com" <Jonathan.Cameron@huawei.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Lorenzo Pieralisi <lpieralisi@kernel.org>,
	Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>,
	Rob Herring <robh@kernel.org>,
	"K. Y. Srinivasan" <kys@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	"Wei Liu" <wei.liu@kernel.org>, Dexuan Cui <decui@microsoft.com>,
	"open list:Hyper-V/Azure CORE AND DRIVERS"
	<linux-hyperv@vger.kernel.org>
Subject: RE: [PATCH 2/3] PCI: Enable host bridge emulation for PCI_DOMAINS_GENERIC platforms
Date: Thu, 17 Jul 2025 12:59:05 -0700
Message-ID: <68795609847f7_137e6b100d8@dwillia2-xfh.jf.intel.com.notmuch>
In-Reply-To: <SN6PR02MB4157ADD06608EFE00B86A3F7D451A@SN6PR02MB4157.namprd02.prod.outlook.com>

Michael Kelley wrote:
> From: Dan Williams <dan.j.williams@intel.com> Sent: Wednesday, July 16, 2025 9:09 AM

Thanks for taking a look, Michael!

[..]
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index e9448d55113b..833ebf2d5213 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -6692,9 +6692,50 @@ static void pci_no_domains(void)
> >  #endif
> >  }
> > 
> > +#ifdef CONFIG_PCI_DOMAINS
> > +static DEFINE_IDA(pci_domain_nr_dynamic_ida);
> > +
> > +/*
> > + * Find a free domain_nr either allocated by pci_domain_nr_dynamic_ida or
> > + * fallback to the first free domain number above the last ACPI segment number.
> > + * Caller may have a specific domain number in mind, in which case try to
> > + * reserve it.
> > + *
> > + * Note that this allocation is freed by pci_release_host_bridge_dev().
> > + */
> > +int pci_bus_find_emul_domain_nr(int hint)
> > +{
> > +	if (hint >= 0) {
> > +		hint = ida_alloc_range(&pci_domain_nr_dynamic_ida, hint, hint,
> > +				       GFP_KERNEL);
> 
> This almost preserves the existing functionality in pci-hyperv.c. But if the
> "hint" passed in is zero, current code in pci-hyperv.c treats that as a
> collision and allocates some other value. The special treatment of zero is
> necessary per the comment with the definition of HVPCI_DOM_INVALID.
> 
> I don't have an opinion on whether the code here should treat a "hint"
> of zero as invalid, or whether that should be handled in pci-hyperv.c.

Oh, I see what you are saying. I made the "hint == 0" case start working
where previously it should have failed. I think that is best handled in
pci-hyperv.c with something like the following, which also fixes a
regression I caused by leaving @dom unsigned (as a u16 it can never
satisfy the "dom < 0" error check):

diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index cfe9806bdbe4..813757db98d1 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -3642,9 +3642,9 @@ static int hv_pci_probe(struct hv_device *hdev,
 {
 	struct pci_host_bridge *bridge;
 	struct hv_pcibus_device *hbus;
-	u16 dom_req, dom;
+	int ret, dom = -EINVAL;
+	u16 dom_req;
 	char *name;
-	int ret;
 
 	bridge = devm_pci_alloc_host_bridge(&hdev->device, 0);
 	if (!bridge)
@@ -3673,7 +3673,8 @@ static int hv_pci_probe(struct hv_device *hdev,
 	 * collisions) in the same VM.
 	 */
 	dom_req = hdev->dev_instance.b[5] << 8 | hdev->dev_instance.b[4];
-	dom = pci_bus_find_emul_domain_nr(dom_req);
+	if (dom_req)
+		dom = pci_bus_find_emul_domain_nr(dom_req);
 
 	if (dom < 0) {
 		dev_err(&hdev->device,
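
To illustrate the @dom signedness point, here is a minimal user-space
sketch, not kernel code. alloc_domain_stub(), is_error() and the two
failure_seen_*() helpers are hypothetical names made up for this example;
uint16_t stands in for the kernel's u16, and a 32-bit int is assumed.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an allocator that fails with a negative errno. */
static int alloc_domain_stub(void)
{
	return -ENOSPC;
}

/* The usual kernel-style "negative means error" test. */
static bool is_error(int v)
{
	return v < 0;
}

/* Mirrors the old declaration: the result is stored in a 16-bit unsigned. */
static bool failure_seen_unsigned(void)
{
	uint16_t dom = (uint16_t)alloc_domain_stub();

	/* dom converts to an int in the range 0..65535, so the error is lost. */
	return is_error(dom);
}

/* Mirrors the fixed declaration: the result is stored in an int. */
static bool failure_seen_signed(void)
{
	int dom = alloc_domain_stub();

	return is_error(dom);
}
```

With the unsigned variable the negative errno wraps to a large positive
value and the failure goes unnoticed; with an int it is detected.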

> > +
> > +		if (hint >= 0)
> > +			return hint;
> > +	}
> > +
> > +	if (acpi_disabled)
> > +		return ida_alloc(&pci_domain_nr_dynamic_ida, GFP_KERNEL);
> > +
> > +	/*
> > +	 * Emulated domains start at 0x10000 to not clash with ACPI _SEG
> > +	 * domains.  Per ACPI r6.0, sec 6.5.6,  _SEG returns an integer, of
> > +	 * which the lower 16 bits are the PCI Segment Group (domain) number.
> > +	 * Other bits are currently reserved.
> > +	 */
> 
> Back in 2018 and 2019, the Microsoft Linux team encountered problems with
> PCI domain IDs that exceeded 0xFFFF. User space code, such as the Xorg X server,
> assumed PCI domain IDs were at most 16 bits, and retained only the low 16 bits
> if the value was larger. My memory of the details is vague, but I believe some
> or all of this behavior was tied to libpciaccess. As a result of these user space
> limitations, the pci-hyperv.c code made sure to not create any domain IDs
> larger than 0xFFFF. The problem was not just theoretical -- Microsoft had
> customers reporting issues due to the "32-bit domain ID problem" and the
> pci-hyperv.c code was updated to avoid it.
> 
> I don't have information on whether user space code has been fixed, or
> the extent to which such a fix has propagated into distro versions. At the
> least, a VM with old user space code might break if the kernel is upgraded
> to one with this patch. How do people see the risks now that it is 6 years
> later? I don't have enough data to make an assessment.
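
To make the concern concrete, here is a small sketch of what a tool that
keeps only the low 16 bits of the domain would see. This is a hypothetical
model of the behavior described above, not code taken from libpciaccess or
Xorg; both function names are invented for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a tool that stores the PCI domain in 16 bits. */
static uint16_t legacy_domain_view(uint32_t kernel_domain)
{
	return (uint16_t)(kernel_domain & 0xFFFF);
}

/* True when two distinct kernel domains look identical to such a tool. */
static bool legacy_view_collides(uint32_t a, uint32_t b)
{
	return a != b && legacy_domain_view(a) == legacy_domain_view(b);
}
```

Under this model an emulated domain of 0x10000 is indistinguishable from
domain 0, while any two distinct domains at or below 0xFFFF stay distinct.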

A couple of observations:

- I think it would be reasonable to not fall back in the hint case with
  something like this:

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 833ebf2d5213..0bd2053dbe8a 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -6705,14 +6705,10 @@ static DEFINE_IDA(pci_domain_nr_dynamic_ida);
  */
 int pci_bus_find_emul_domain_nr(int hint)
 {
-	if (hint >= 0) {
-		hint = ida_alloc_range(&pci_domain_nr_dynamic_ida, hint, hint,
+	if (hint >= 0)
+		return ida_alloc_range(&pci_domain_nr_dynamic_ida, hint, hint,
 				       GFP_KERNEL);
 
-		if (hint >= 0)
-			return hint;
-	}
-
 	if (acpi_disabled)
 		return ida_alloc(&pci_domain_nr_dynamic_ida, GFP_KERNEL);
 
- The VMD driver has been allocating 32-bit PCI domain numbers since
  v4.5, commit 185a383ada2e ("x86/PCI: Add driver for Intel Volume
  Management Device (VMD)"). At a minimum, if it is still a problem, it
  is a shared problem, but the significant deployment of VMD since then
  likely indicates it is ok. If not, the above change at least makes the
  Hyper-V case avoid 32-bit domain numbers.
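
As a rough user-space model of how the two changes above would combine for
Hyper-V (a sketch only, with a plain array standing in for the IDA and all
toy_* names being hypothetical): a zero request is rejected by the caller,
and a non-zero request is either reserved exactly or fails, so nothing
above 0xFFFF is ever handed out on this path.

```c
#include <errno.h>
#include <stdbool.h>

#define TOY_MAX_DOMAINS 0x10000	/* models the 16-bit domain space only */

static bool toy_used[TOY_MAX_DOMAINS];

/*
 * Toy analogue of ida_alloc_range(ida, id, id, gfp): reserve exactly @id.
 * The -EINVAL range check is a choice made for this sketch.
 */
static int toy_reserve_exact(int id)
{
	if (id < 0 || id >= TOY_MAX_DOMAINS)
		return -EINVAL;
	if (toy_used[id])
		return -ENOSPC;	/* already taken: no fallback to another id */
	toy_used[id] = true;
	return id;
}

/* Toy analogue of the hv_pci_probe() domain selection after both changes. */
static int toy_hv_pick_domain(unsigned short dom_req)
{
	int dom = -EINVAL;

	if (dom_req)
		dom = toy_reserve_exact(dom_req);
	return dom;
}
```

For example, toy_hv_pick_domain(0) returns -EINVAL, the first
toy_hv_pick_domain(0x1234) returns 0x1234, and a second request for the
same value returns -ENOSPC instead of picking a different domain.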
